ABSTRACT


We  have  developed  a  combined  statistical-dynamical  prediction  scheme 
to  predict  the  probability  of  tropical  cyclone  (TC)  formation  at  daily,  2.5° 
horizontal  resolution  across  the  western  North  Pacific  at  intraseasonal  lead 
times.  Through  examination  of  previous  research  and  our  own  analysis,  we 
chose  five  variables  to  represent  the  favorability  of  the  climate  system  to  support 
tropical  cyclogenesis.  These  so-called  large-scale  environmental  factors 
(LSEFs)  include:  low-level  relative  vorticity,  sea  surface  temperature,  vertical 
wind  shear,  Coriolis,  and  upper-level  divergence.  Logistic  regression  was 
employed  to  generate  a  statistical  model  representing  the  probability  of  TC 
formation  at  every  grid  point  based  on  these  LSEFs.  Thorough  verification  of 
zero-lead hindcasts reveals that this model displays skill and potential value for
risk-averse customers. In particular, these hindcasts had a positive Brier skill score
of  0.03  and  a  skillful  relative  operating  characteristic  skill  score  of  0.68.  The  fully 
coupled,  one-tier  NCEP  Climate  Forecast  System  was  used  as  the  dynamical 
model  with  which  to  forecast  the  LSEFs  and,  in  turn,  force  the  regression  model. 
A  series  of  individual  TC  case  studies  were  conducted  to  demonstrate  the 
predictive  potential,  at  intraseasonal  leads,  of  our  statistical-dynamical  method. 
Lastly,  we  investigated  the  applicability  of  intraseasonal  forecasts  to  military 
planning. 
I. INTRODUCTION


A.  MOTIVATION 

Three  months  before  the  kickoff  of  the  Valiant  Shield  (VS)  naval  exercise, 
a group of U.S. Navy planners gathers in a small conference room at Pearl
Harbor to compare notes. The meeting scrubs the logistics and rules of
engagement for this large-scale, joint-forces event held in the tropical western
North Pacific region near Guam. Hours later, an environment-savvy planner
asks, “Is the weather going to cooperate?” He continues, “How might
tropical  cyclones  affect  the  ability  of  the  different  platforms  to  operate  in  the 
designated  exercise  area  and  period?” 

This  meeting  may  be  hypothetical,  but  those  questions  are  exactly  the 
type  that  military  planners  should  be  asking  and  that  Department  of  Defense 
(DoD)  weather  centers  should  be  capable  of  answering  with  confidence. 
Unfortunately,  no  suitable  products  currently  exist  to  answer  such  questions. 
Such  mission  planning  well  in  advance  of  the  operation(s)  is  not  unusual  in  the 
DoD.  Though  this  example  depicts  a  complex  exercise,  the  same  environmental 
intelligence  should  be  exploited  for  a  multitude  of  missions,  such  as  planning 
flight qualification training at long leads or establishing a CORONET
transoceanic air bridge.

A  gap  clearly  exists  in  DoD  weather  support  for  forecasts  with  lead  times 
on  the  order  of  weeks  to  months.  Consider  the  potential  benefit — in  dollars, 
hours,  morale,  etc. — if  weather  forecasters  were  able  to  provide  those  planners 
with  a  regional  outlook  for  tropical  cyclone  activity  and  an  idea  of  avenues  of  safe 
passage  through  the  western  North  Pacific.  This  thesis  will  investigate  one  such 
approach  that  would  benefit  this  scenario. 




B.  CLIMATE  PREDICTION  PROCESS 


1.  Syntax  and  Definitions 

Below  are  definitions  for  and  discussions  of  some  key  terms  that  are  used 
in  this  paper. 


a.  Climatology 

While  climatology  literally  refers  to  the  description  and  scientific 
study  of  climate  (Glickman  2000),  this  term  is  used  in  this  work  to  refer  to  a 
quantitative description of an element in terms of a long-term average; for
example,  the  frequency  of  cyclone  formation  for  a  given  grid  box  in  a  region  for  a 
given  unit  of  time.  Climatology  is  also  used  throughout  this  thesis  as  the  baseline 
forecast  against  which  our  methods  will  be  compared.  Appendix  A  includes  a 
more  in-depth  discussion  on  the  variations  in  the  methods  used  to  calculate 
climatologies. 


b.  Intraseasonal 

Used  in  reference  to  a  subset  of  forecast  products  and  associated 
lead  times,  intraseasonal  comprises  a  period  bounded  by  a  single  season  or 
other three-month period. Often referred to as long-range forecasting,
subseasonal  forecasting,  or  short-term  climate  prediction,  the  lead  times  for 
intraseasonal  products  and  techniques  are  typically  longer  than  two  weeks,  but 
less  than  three  months. 

c.  Prediction 

The  word  prediction  is  readily  used  interchangeably  with  a  form  of 
the  word  forecast.  Though  such  use  may  be  grammatically  correct,  we  use  the 
word  prediction  to  denote  a  quantitative  scientific  estimate  of  future  climate 
conditions  that  has  skill.  A  forecast,  in  contrast,  is  used  to  refer  to  both  the 
prediction  process,  regardless  of  perceived  skill,  and  the  deliverable  that  results. 
The difference here between forecast and prediction may be more
psychological  than  meteorological.  The  customers  for  forecasts  and  predictions 
(e.g.,  military  operators,  the  general  public,  etc.)  expect  that  weather  forecasts 
are  readily  available  (e.g.,  a  five-day  forecast  from  the  local  news  media).  In 
contrast,  a  description  of  the  future  state  of  the  climate  system  may  be  best 
thought  of  as  a  prediction  that  is  issued  only  if  the  prediction  has  some  perceived 
skill beyond a baseline forecast (e.g., a forecast of climatological conditions). In
that  context,  a  customer  should  not  always  expect  a  prediction  that  varies  from 
the  long-term  mean  (LTM)  for  temperature  over  the  forthcoming  summer  in  the 
same  way  he  expects  a  local  forecast  for  tomorrow’s  high  temperature. 

d.  Tropical  Cyclone 

In  the  most  general  form,  tropical  cyclone  refers  to  a  closed, 
cyclonic  circulation  with  its  origins  over  a  tropical  ocean  basin.  Tropical  cyclones 
(TCs)  are  classified  according  to  their  intensity,  and  these  classifications  vary 
somewhat  by  ocean  basin.  In  the  western  North  Pacific  (WNP),  a  tropical 
depression is characterized by winds up to 17 m s⁻¹, a tropical storm has winds of
18 m s⁻¹ to 32 m s⁻¹, a typhoon has winds of 33 m s⁻¹ to 66 m s⁻¹, and a super typhoon
has winds that exceed 66 m s⁻¹.
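
The WNP intensity classes listed above can be expressed as a simple lookup. The following Python sketch is illustrative only: the function name is hypothetical, and the handling of wind speeds that fall between the whole-number bounds quoted above (for example, 17.5 m s⁻¹) is an assumption, since the text gives only whole-number limits.

```python
def classify_wnp_tc(max_wind_ms: float) -> str:
    """Return the WNP intensity class for a sustained wind speed in m/s.

    Thresholds follow the definitions in the text. Values between the
    quoted whole-number bounds (e.g., 17.5) are assigned to the lower
    class; that choice is an assumption of this sketch.
    """
    if max_wind_ms < 0:
        raise ValueError("wind speed must be non-negative")
    if max_wind_ms < 18:
        return "tropical depression"   # winds up to 17 m/s
    if max_wind_ms < 33:
        return "tropical storm"        # 18 to 32 m/s
    if max_wind_ms <= 66:
        return "typhoon"               # 33 to 66 m/s
    return "super typhoon"             # winds exceeding 66 m/s
```

For example, a system with 40 m s⁻¹ winds would be returned as a typhoon under these thresholds.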

2.  Elements  of  Operational  Climate  Prediction 

The  basis  of  this  thesis  is  the  exploratory  development  of  a  state-of-the- 
science  climate  prediction  system  for  likely  TC  formation  areas  in  a  given 
geographical  region.  Though  the  idea  of  climate  prediction — intraseasonal  or 
otherwise — is  not  new,  no  available  resources  clearly  outline  the  prediction 
process. 
Figure 1. Schematic of the climate prediction process.


Figure  1  provides  a  schematic  description  of  a  state-of-the-science 
approach to developing operational climate predictions. As presented, this
process  is  generic  and  may  be  applied  to  various  meteorological  or 
oceanographic  elements  and  over  various  time  scales.  The  flow  of  the  arrows  in 
the  diagram  indicates  that  the  process  is  fluid,  and  often  iterative  in  nature. 
Though  the  process  may  vary  somewhat  in  specific  cases,  the  depicted  steps  are 
all  important  to  the  development  of  an  operational  deliverable. 

3.  Methods  of  Prediction 

Though  the  Forecast  Method  Development  step  is  only  one  step  in  the 
process  depicted  in  Figure  1,  the  development  of  the  forecast  method  is  likely  the 
most  challenging  component  of  the  climate  prediction  process.  Three  primary 
categories  of  predictive  methods  exist  in  operational  intraseasonal/seasonal 
forecasting: statistical, dynamical, and a combined statistical-dynamical
approach.
a. Statistical

Whether  the  approach  is  projecting  average  conditions, 
constructing  analogues,  or  applying  empirical  orthogonal  functions,  statistical 
methods  are  widely  used  in  climate  prediction.  These  and  many  other  statistical, 
also  referred  to  as  empirical,  methods  use  existing  data  sets  in  order  to  develop 
predictive  methods  based  on  past  conditions.  Such  methods  are  mainstays  for 
intraseasonal  and  seasonal  climate  prediction  at  the  National  Weather  Service’s 
Climate  Prediction  Center  (CPC)  and  other  climate  prediction  centers  (van  den 
Dool  2007). 


b.  Dynamical 

Numerical  weather  prediction  may  be  the  standard  for  day-to-day 
weather  forecasting,  but  dynamical  methods  in  intraseasonal  and  seasonal 
climate prediction are often less skillful than comparable statistical methods. Van
den Dool (2007) notes that in 2005, the National Centers for Environmental
Prediction (NCEP) presented an award to a group of its employees for
developing the Climate Forecast System (CFS; to be discussed in Section II.B.4)
that  led  to  “the  first  time  in  history  (in  which)  numerical  seasonal  predictions  were 
on  par  with  empirical  methods.” 

The  CFS  belongs  to  a  class  of  numerical  models  known  as  general 
circulation  models  (GCMs).  Many  GCMs  were  built  to  focus  on  global  climate 
issues;  therefore,  they  struggle  when  applied  regionally  at  intraseasonal  time 
scales.  Coarse  resolution,  limited  parameterizations,  and  systematic  model 
errors  all  translate  into  limited  operational  use  of  many  of  the  GCMs.  However, 
one  advantage  of  a  GCM  vice  a  purely  statistical  method  is  the  ability  of  the 
numerical  model  to  explicitly  account  for  nonlinear  processes  (van  den  Dool 
2007).  The  reader  is  directed  to  van  den  Dool  (2007)  for  an  informative 
discussion  of  the  relative  performance  of  GCMs  compared  to  empirical  methods. 
c. Combined

The wide use of statistical techniques in short-term climate
prediction leads one to question whether there is any benefit to using a GCM
or combined statistical-dynamical approach, or whether a purely statistical forecast
would perform just as well. A combined methodology is potentially superior,
since  such  an  approach  has  the  ability  to  incorporate  the  advantages  of  each 
approach. The method used in this thesis uses a GCM to predict the
large-scale environmental factors (LSEFs) that affect TC formation, and then
uses these predicted LSEFs to force a statistical model. That statistical model
has been trained on many years of TC and LSEF data and yields the probability
of TC formation given the GCM predictions of the LSEFs.
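
The second stage of this combined approach can be sketched in a few lines. The example below assumes the standard logistic regression form, p = 1 / (1 + exp(-(b0 + b1*x1 + ... + bn*xn))), in which the x values are the LSEF predictors at one grid point and the b values are fitted coefficients. The function name and argument names are hypothetical, and no coefficient values from this thesis are implied.

```python
import math


def formation_probability(predictors, coefficients, intercept):
    """Logistic-regression probability for one grid point.

    predictors   : LSEF values at the grid point (e.g., from GCM output)
    coefficients : one fitted coefficient per predictor (placeholders here)
    intercept    : fitted intercept term
    """
    if len(predictors) != len(coefficients):
        raise ValueError("need exactly one coefficient per predictor")
    # Linear combination of the predictors.
    z = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    # Evaluate the logistic function in a form that avoids overflow
    # when z is a large negative number.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)
```

In a statistical-dynamical setting, the same function would be evaluated at every grid point and forecast day, with the predictors taken from the dynamical model rather than from an analysis.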

C.  EXISTING  PRODUCTS 

1.  Seasonal 

Seasonal  predictions  of  TC  activity  forecast  the  overall  character  for  an 
entire  TC  season  within  an  entire  basin  (e.g.,  the  total  number  of  TCs  in  a  WNP 
TC  season).  The  lead  times  for  seasonal  predictions  are  approximately  zero  to 
six  months.  Among  the  earliest  seasonal  tropical  cyclone  predictions  were  those 
produced  at  Colorado  State  University  in  the  1980s  for  the  Atlantic  basin. 
Prediction  techniques  have  continued  to  develop  and  expand  since  these  early 
forecasts,  and  now  include  other  ocean  basins  (Camargo  2006).  Though 
seasonal  prediction  is  not  the  focus  of  this  thesis,  these  existing  products  are 
briefly  mentioned  here  as  they  provide  much  of  the  framework  from  which  the 
newer  intraseasonal  products  are  derived.  Seasonal  TC  forecast  products  for 
the  WNP  are  generated  at  various  centers  using  statistical  and  dynamical 
methods.  The  following  listing  of  centers  and  products  is  by  no  means  all- 
inclusive,  but  provides  a  glimpse  into  the  spectrum  of  participants  and 
approaches. 
a. Statistical

The  City  University  of  Hong  Kong  has  issued  seasonal  forecasts  for 
the  number  of  storms  in  the  WNP  basin  since  1997.  They  use  several 
environmental conditions, the most prominent of which are El Niño and the
Pacific  subtropical  ridge,  in  order  to  forecast  the  number  of  TCs  (Chan  et  al. 
2001).  Tropical  Storm  Risk,  a  consortium  out  of  the  United  Kingdom,  also  issues 
statistical  forecasts  for  TC  activity  in  the  WNP.  In  addition,  they  generate  a 
forecast  of  the  NW  Pacific  accumulated  cyclone  energy  (ACE)  index,  based  in 
large part on conditions in the Niño 3.75 region (Lloyd-Hughes et al. 2004).

b.  Dynamical 

The  European  Centre  for  Medium-range  Weather  Forecasts 
(ECMWF)  issues  seasonal  forecasts  of  TC  activity  based  on  coupled  ocean- 
atmosphere  models  (Vitart  and  Stockdale  2001).  The  International  Research 
Institute  for  Climate  and  Society  (IRI)  also  generates  seasonal  forecasts  of  TC 
frequency,  but  uses  a  “two-tier”  approach.  The  first  step,  or  tier,  entails 
employing  various  statistical  and  dynamical  models  to  forecast  future  sea  surface 
temperature  (SST)  conditions.  Then,  the  predicted  SSTs  are  used  to  force 
numerical  atmospheric  models.  Detection  algorithms  are  then  used  to  identify 
TC-like  features  from  amidst  the  coarse-resolution  output  fields  (Camargo  and 
Zebiak  2002). 

2.  Intraseasonal 

Intraseasonal  predictions  of  TC  activity  forecast  the  activity  for 
intraseasonal  periods  (e.g.,  two  weeks  to  two  months)  within  an  ocean  basin.  TC 
prediction  at  intraseasonal  time  scales  is  a  comparatively  new  area  of  research, 
which  may  be  attributed  to  increasing  model  resolution,  improving  ensemble 
techniques,  and  continuing  research  into  intraseasonal  climate  oscillations. 
Many  of  the  centers  noted  in  the  seasonal  section  are  active  in  the  intraseasonal 
realm  as  well. 
a. Non-DoD Products

On  the  intraseasonal  time  scale,  the  Madden-Julian  oscillation 
(MJO)  presents  the  greatest  predictive  potential  for  empirical  approaches.  Useful 
predictive skill for statistical methods is on the order of 15 to 20 days, limited by
the  signal-to-noise  ratio  of  the  MJO  (Camargo  2006).  Frank  and  Roundy  (2006) 
look  beyond  MJO  alone  and  generate  daily  probabilities  of  formation  using  a 
wide  variety  of  wave  modes  and  climate  signals.  More  recently,  Leroy  and 
Wheeler  (2008)  used  logistic  regression  in  a  purely  statistical  prediction  scheme 
to  predict  the  probability  of  TC  formation  in  fixed  zones  of  the  Southern 
Hemisphere.  Their  predictors  include  one  representing  a  smoothed 
climatological  cycle,  two  representing  the  propagation  of  MJO,  and  two 
representing  the  leading  patterns  of  SST  variability. 

Despite  the  promise  of  the  budding  statistical  methods,  Camargo 
(2006)  contends  that  “while  there  is  much  room  for  improvement  in  the  skill  and 
application  of  empirical/statistical  methods  of  intra-seasonal  TC  prediction,  the 
greatest  hope  for  improvement  lies  with  dynamical/numerical  models.”  One  of 
the  most  promising  players  in  the  dynamical  field  is  the  ECMWF  and  their 
Ensemble  Prediction  System  (EPS). 

Only a few centers create subseasonal forecasts, and even fewer do so
operationally and make the forecasts freely available online. The CPC is among
this  select  group,  with  its  operational  Global  Tropics  Benefits/Hazards 
Assessment  product. 
Figure 2. Example CPC Global Tropics Benefits/Hazards Assessment,
issued by CPC/NCEP on 6 August 2007 and valid 14-20 August 2007
(From http://www.cpc.noaa.gov/products/precip/CWlink/ghazards/;
accessed 12 January 2009).


Figure  2  is  an  example  of  the  Global  Tropics  Benefits/Hazards 
Assessment  issued  by  CPC.  This  product  has  both  the  graphical  depiction  (as 
shown  in  Figure  2),  as  well  as  accompanying  text  that  explains  the  assessment. 
The  description  for  the  highlighted  area  in  the  WNP  labeled  region  “4”  states 
(From http://www.cpc.noaa.gov/products/precip/CWlink/ghazards/; accessed 12
January  2009): 

The  potential  for  tropical  cyclone  development  northeast  of  the 
Philippines  and  in  the  South  China  Sea.  Active  convection  is 
expected  in  this  area  and  with  areas  of  anticipated  weak  vertical 
wind  shear  and  above  average  SSTs  the  prospects  for  tropical 
cyclogenesis  are  increased.  Confidence:  Moderate. 

Though  this  product  makes  strides  with  providing  outlooks  for 
impacts  on  TC  activity  due  to  the  forecasted  state  of  the  tropical  climate  system, 
this  product  is  limited  by  its  subjective  combination  of  forecast  tools.  Plans  for 
this  product  include  making  it  more  objective  in  nature  (Gottschalck  et  al.  2008). 
b. DoD Products

As  of  this  writing,  no  DoD  centers  are  actively  issuing  forecasts  at 
seasonal  or  intraseasonal  leads  for  the  tropics.  The  Joint  Typhoon  Warning 
Center  (JTWC)  is  the  DoD  agency  responsible  for  issuing  tropical  cyclone 
warnings  for  the  Indian  and  Pacific  Oceans.  Products  produced  by  JTWC  are 
intended  for  use  in  decision  making  by  operational  military  units,  though  most  of 
these  products  are  nowcasts  and/or  short-term  forecasts. 

The  Fleet  Numerical  Meteorological  and  Oceanographic 
Detachment  -  Asheville  (FNMOD)  is  another  logical  place  for  operators/planners 
to  turn  for  information  regarding  future  tropical  activity.  FNMOD  does  maintain  a 
Mariners’  Worldwide  Climate  Guide  to  Tropical  Storms  at  Sea,  which  appears  to 
be  a  form  of  climatology  for  each  basin  broken  down  into  10-15  day  periods 
(depending  on  the  time  of  year).  This  guide  is  certainly  better  than  having 
nothing  at  all,  but  contains  no  information  about  the  current  or  forecasted  state  of 
the  climate  system. 

Collocated with FNMOD is the Air Force’s 14th Weather Squadron
(14WS;  formerly  known  as  the  Air  Force  Combat  Climatology  Center  (AFCCC)). 
While  the  14WS  recently  began  issuing  long-range  forecasts  for  select  locations 
(i.e.,  Iraq,  Afghanistan),  no  products  concerning  the  current  or  forecasted  state  of 
the  tropical  climate  system  in  general,  or  TCs  in  particular,  are  available. 

D.  RESEARCH  MOTIVATION  AND  SCOPE 
1.  Prior  Work 

The  idea  for  taking  a  combined  statistical-dynamical  approach  for 
predicting  likely  cyclogenetic  regions  in  the  tropics  evolved  from  thesis  work  by 
Meyer  (2007).  Though  his  study  did  not  venture  into  the  realm  of  forecasting, 
Meyer  used  logistic  regression  to  calculate  the  probability  of  TC  formation  in 
weekly  five-degree  latitude  by  five-degree  longitude  grid  blocks  as  a  way  of 
quantifying  the  impacts  of  changes  in  the  large-scale  environment  on  the 
likelihood  of  TC  formation. 
Figure 3. Example figure generated using methods from Meyer (2007).
Contoured are the zero-lead hindcast probabilities of TC formation for
week 26 of 2006; the contours are at 0.01, 0.25, 0.40, and 0.55. The
red dot represents a verifying TC formation location. (Plot title: “TC
Formation Probability for 2006 Week 26, Thresholds not considered.”)

Figure  3  is  an  example  plot  after  Meyer’s  work.  Such  plots  resulted  in  a 
perceived  forecast  potential  and  a  methodological  basis  for  this  thesis  work. 

2.  Research  Questions 

This  thesis  is  an  exploration  into  the  viability  of  the  prescribed 
methodology as a predictive tool at intraseasonal time scales. While many
subquestions exist, this work will primarily focus on investigating the following two
questions: 

1)  Can  favorable  regions  for  tropical  cyclogenesis  be  predicted  at 
intraseasonal  lead  times,  by  way  of  forcing  a  statistical  model  with  available 
output  from  a  GCM? 

2)  Does  a  combined  statistical-dynamical  approach  appear  to  result  in 
skill  and  value  beyond  that  which  basic  climatology  provides? 
As one can deduce from the preceding questions, this work will
concentrate  on  methodology  and  stand  as  a  “proof  of  concept.”  This  thesis  is  not 
an  attempt  to  advance  the  science  of  tropical  dynamics;  however,  it  may 
indirectly  contribute  to  an  improved  understanding  of,  and  ability  to  model,  the 
large  scale  environmental  factors  that  affect  TC  activity. 
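
The second research question turns on measuring skill relative to climatology. One common way to quantify this is the Brier skill score, BSS = 1 - BS / BS_ref, where BS is the mean squared difference between forecast probabilities and binary outcomes and BS_ref is the same quantity for a reference forecast such as climatology. The sketch below assumes that standard definition; the function names are hypothetical, and the exact verification procedure used in this thesis is not reproduced here.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def brier_skill_score(probabilities, reference_probabilities, outcomes):
    """Skill relative to a reference forecast (e.g., climatology).

    Positive values mean the forecast beats the reference, zero means
    no improvement, and 1 is a perfect forecast.
    """
    bs = brier_score(probabilities, outcomes)
    bs_ref = brier_score(reference_probabilities, outcomes)
    if bs_ref == 0:
        raise ValueError("reference forecast is perfect; skill is undefined")
    return 1.0 - bs / bs_ref
```

Under this definition, any positive score indicates an improvement over the reference forecast, however small.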

3.  Thesis  Organization 

In  order  to  answer  the  two  aforementioned  research  questions,  this  work 
will  focus  on  the  following  steps  of  the  climate  prediction  process  (see  Figure  1): 
Data  Selection,  Climate  System  Analysis,  Forecast  Model  Development, 
Hindcast/Forecast,  and  Verification/Evaluation. 

Chapter  II  begins  by  defining  the  study  region  and  time  period,  and  then 
provides  a  brief  look  at  the  numerous  data  sets  used  in  this  study,  as  well  as  the 
methods  used  in  developing  and  testing  our  predictive  model.  Also  included  in 
Chapter  II  is  a  summary  of  the  large-scale  variables  thought  to  impact  TC 
formation.  Chapter  III  outlines  the  results  of  the  model  development  and 
hindcasting;  in  addition,  Chapter  III  demonstrates  the  predictive  potential  of  the 
model  by  way  of  a  pair  of  case  studies.  Chapter  IV  provides  a  summary  of  our 
results  and  conclusions,  and  offers  suggestions  for  future  research. 

To  make  this  thesis  purposefully  concise,  several  topics  are  only 
mentioned  briefly  in  the  text  but  covered  more  at  length  in  the  appendices.  The 
following  topics  are  appended  to  this  work  as  references  for  the  reader: 
climatology  development  and  selection  (Appendix  A),  calculation  of  variables 
from  available  output  fields  (Appendix  B),  and  plots  from  additional  case  studies 
(Appendix  C). 
II. DATA AND METHODS


A.  STUDY  REGION 

The  western  North  Pacific  (WNP)  was  chosen  as  the  focus  region  for  this 
study.  Our  analyses  of  JTWC  best  track  TC  data  from  1970-2007  indicate  that 
an  average  of  30  TCs — tropical  depressions  through  super  typhoons — form  per 
year,  with  a  standard  deviation  of  4.8  storms.  With  that,  the  WNP  has  the 
highest  average  number  of  TCs  annually  of  all  basins,  and  accounts  for  nearly 
30%  of  global  annual  total  TCs  (Chan  2004).  The  WNP  is  also  the  only  ocean 
basin  wherein  TC  formation  is  observed  throughout  the  year,  although  the 
majority  of  cyclones  develop  between  June  and  November  (Frank  1987).  This 
study  region  was  also  chosen  for  its  economic  and  military  importance. 


Figure 4. Depiction of the WNP study region (outlined by the blue box) and
TC formation points (red dots), constructed from JTWC WNP best track
data from years 1970-2007. (Plot title: “TC Formation Points: 1970 - 2007”;
the longitude axis spans 90E to 120W.)

The  study  region  extends  from  100°E  to  190°E  (170°W)  and  from  the 
Equator  to  30°N,  as  depicted  in  Figure  4.  No  literature  standard  exists  for 
defining  the  WNP  basin;  however,  the  bounds  for  our  study  region  differ  no  more 
than  10°  in  any  one  direction  from  the  majority  of  other  sources.  One  reason  that 
our bounds differ slightly from those of other studies that deal with the WNP is
that, in focusing on genesis locations, there is no need to allow for the recurvature of
TCs post-formation. Defining restrictive bounds also minimizes the potential
impacts of data dilution in our statistical verification.
Figure 5. Number of TC formations versus day of year, constructed with
JTWC  WNP  best  track  data  from  years  1970-2007. 

As  noted  earlier,  TC  formations  are  observed  throughout  the  year  in  the 
WNP.  Figure  5  displays  the  variation  in  the  number  of  TC  formations  in  the  WNP 
during  the  period  for  a  given  Julian  day.  This  figure  also  highlights  the  unequal 
distribution  of  formations  over  the  course  of  the  year.  If  one  defines  the  peak 
formation  period  as  June  through  November  (as  in  Frank  1987),  those  months 
account  for  936  of  the  1122,  or  83%,  of  the  TCs;  in  contrast,  a  peak  formation 
period  of  July  through  October  (as  in  Sobel  and  Camargo  2005)  accounts  for  761 
of the 1122, or 68%, of TC formations. Hereafter in this study the peak formation
season  will  be  defined  as  a  period  encompassing  the  months  June  through 
November. 


Just as the temporal distribution of TC formations is not uniform
throughout  the  year,  the  distribution  varies  spatially  over  the  extent  of  the  study 
region.  Figure  6  highlights  the  spatial  variability  from  grid  point  to  grid  point. 


Figure 6. Contoured probability of TC formation, constructed from binned
JTWC best track data from the years 1970-2007. Values represent the
probability that a TC will form per year in a given grid box.

Figure  6  shows  what  we  term  the  “Raw  2.5  Degree  Formation 
Climatology”  that  was  constructed  by  summing  the  number  of  TC  formations  in 
the  JTWC  best  track  data  for  1970-2007  within  2.5°  latitude/longitude  grid  boxes, 
and  then  dividing  the  total  per  box  by  the  number  of  years.  The  result  is  a  map  of 
the  climatological,  or  long  term  mean,  probability  of  TC  formation  during  January- 
December.  The  probabilities  are  based  on  TC  formation  over  the  course  of  the 
entire  year,  so  the  contour  values  are  not  overly  useful  to  most  decision  makers. 
See  Appendix  A  for  further  discussion  on  TC  formation  climatology. 
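
The binning described above can be sketched as follows. This is a minimal illustration rather than the code used in this study: the function name and the input layout (a list of latitude/longitude pairs for formation points) are hypothetical, and the assumption that grid boxes are aligned to the edges of the 100°E-190°E, 0°-30°N study region is ours.

```python
def raw_formation_climatology(formation_points, n_years,
                              lon_min=100.0, lon_max=190.0,
                              lat_min=0.0, lat_max=30.0, box_deg=2.5):
    """Count formations per grid box and divide by the number of years.

    formation_points : iterable of (lat, lon) pairs in degrees; west
                       longitudes may be given as negative values
    n_years          : number of years spanned by the record
    Returns a list of rows (south to north), each a list of per-year
    formation frequencies (west to east).
    """
    if n_years <= 0:
        raise ValueError("n_years must be positive")
    n_lon = int(round((lon_max - lon_min) / box_deg))
    n_lat = int(round((lat_max - lat_min) / box_deg))
    counts = [[0] * n_lon for _ in range(n_lat)]
    for lat, lon in formation_points:
        lon = lon % 360.0  # express longitude as degrees east, 0-360
        inside = lon_min <= lon < lon_max and lat_min <= lat < lat_max
        if not inside:
            continue  # formation outside the study region is ignored
        row = int((lat - lat_min) // box_deg)
        col = int((lon - lon_min) // box_deg)
        counts[row][col] += 1
    return [[c / n_years for c in row] for row in counts]
```

With the default bounds, the result has 12 rows and 36 columns, one value per 2.5° box.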

In  this  study,  we  attempted  to  develop  and  test  a  statistical-dynamical 
method  for  forecasting  TC  formation  probabilities  that  is  more  skillful  than  the 
forecasts  that  could  be  obtained  by  simply  using  climatological  TC  formation 
probabilities  (e.g.,  those  shown  in  Figure  6  and  discussed  in  Appendix  A).  For 
such  a  method  to  be  more  skillful  than  climatology,  it  needs  to  be  skillful  at 
representing  climate  anomalies  in  the  large-scale  environment  that  affect  TC 
formations.  Thus,  it  is  useful  to  review  the  general  conditions  that  influence  TC 
formations  in  the  WNP  during  the  peak  formation  season.  Figure  7  shows  the 
main low-level circulation features that characterize summer conditions in the
WNP.  Note  in  this  figure  the  band  of  convergent  and  cyclonic  flow  marked  by  the 
dashed  line.  This  band  is  often  labeled  the  monsoon  trough  (Ramage  1995),  so 
named  because  of  its  association  with  summertime  monsoonal  flow  in  the  region. 


Figure 7. Schematic depiction of summertime low-level flow over the WNP.
The dashed line marks the monsoon trough and the zig-zag line indicates
the mean ridge axis (From: Figure 1(a), Lander 1996).

The  monsoon  trough  is  associated  with  the  development  of  most  TCs  in 
the  WNP  (Xue  and  Neuman  1984),  due  to  the  predominantly  favorable 
environmental factors (as described in Section II.C). This is also indicated by
the  co-location  of  the  high  probabilities  in  Figure  6  and  the  climatological  position 
of  the  monsoon  trough  in  Figure  7.  The  position  of  the  monsoon  trough 
experiences  normal  seasonal  variations  through  the  year,  as  well  as  spatial  and 
temporal  deviations  from  its  normal  seasonal  cycle.  One  example  of  a  significant 
deviation  is  labeled  a  reverse-oriented  monsoon  trough,  when  the  convergence 
zone  extends  from  southwest  (SW)  to  northeast  (NE)  over  the  WNP  (Chu  2004). 

B.  DATA  SETS  AND  SOURCES 

The  structure,  format,  and  availability  of  the  primary  data  sets  used  in  this 
thesis  are  described  in  this  section.  The  inquisitive  reader  is  directed  to  the  cited 
references for more information on each of these data sources. All of the data
used in this thesis are freely and publicly available online.

1.  JTWC  Best  Track 

The  JTWC  maintains  an  archive  of  tropical  cyclone  data  for  the  WNP.  At 
a  minimum,  these  “best  track”  files  contain  the  latitude  and  longitude  of  the  TC 
center  location  at  six-hour  intervals.  These  data  are  used  for  both  model 
construction  and  verification  in  this  study. 

The  best  track  archive  includes  all  TCs  identified  by  the  JTWC,  and  even 
includes  a  number  of  storms  that  were  determined  to  be  of  sufficient  strength  for 
classification  as  TCs  well  after  the  storms  occurred.  The  aforementioned  TC 
numbers in Section II.A are higher than those in some prior studies that
analyzed  only  storms  that  were  of  tropical  storm  intensity  or  greater. 

The  JTWC  data  set  is  not  without  controversy.  Several  researchers  have 
noted  that  variations  in  analysis  procedures,  as  well  as  changes  in  observational 
tools  (satellite,  aircraft,  etc.)  over  the  years,  may  compromise  the  overall 
consistency of the best track records. Furthermore,
Wu et al. (2006) cite notable differences in the track information from JTWC vice
what  is  available  from  the  Regional  Specialized  Meteorological  Centre  Tokyo; 
among  the  reasons  for  the  discrepancies  are  differences  in  the  time  period  over 
which  winds  were  averaged,  and  differences  in  each  center’s  intensity-estimation 
techniques.  However,  efforts  have  been  made,  and  are  continuing,  to  minimize 
the  discrepancies  within  the  JTWC  best  track  archive  and  between  the  JTWC 
archive and other sources for historical TC information (Chu et al. 2002). We
determined  that  the  potential  problems  with  the  JTWC  best  track  data  were  not 
likely  to  significantly  influence  our  study  results. 

2.  NCEP  Reanalyses 

Global  objective  analyses  that  assimilate  numerous  observational  data 
sources  with  model  output  and  span  many  years  provide  an  increased  ability  to 
investigate the physical processes that surround TC development. Prior to the
introduction  of  these  so-called  reanalyses,  it  was  difficult  to  consistently 
investigate  subtle  variations  in  the  climate  system.  We  used  two  reanalysis  data 
sets:  (1)  the  NCEP/National  Center  for  Atmospheric  Research  (NCAR) 
Reanalysis Projects (Kalnay et al. 1996; Kistler et al. 2001); and (2) the
NCEP/Department of Energy (DOE) Atmospheric Model Intercomparison
Project-II (AMIP-II) Reanalysis (Kanamitsu et al. 2002).

The  NCEP/NCAR  Reanalysis  Projects  data  set  (hereafter  referred  to  as 
R1),  and  the  NCEP/DOE  AMIP-II  Reanalysis  data  set  (hereafter  referred  to  as 
R2)  are  both  based  on  assimilating  data  using  a  fixed  model  at  T62L28 
resolution.  Though  both  reanalyses  use  the  same  raw  observational  data,  the 
R2  project  attempts  to  correct  some  of  the  known  errors  in  the  R1  reanalysis; 
please  review  the  cited  publications  for  more  details. 

Though  other  variables  were  tested,  the  final  atmospheric  variables  used 
in  the  construction  of  our  regression  model  (see  Section  III.A.)  are  all  manually 
derived  from  “A”  variables.  Kalnay  et  al.  (1996)  note  that  an  “A  indicates  that  the 
analysis  variable  is  strongly  influenced  by  observed  data  and,  hence,  it  is  in  the 
most  reliable  class.” 

For  the  purposes  of  this  study,  we  used  daily  mean  fields  interpolated  to  a 
2.5°  global  grid.  R2  was  the  primary  dataset  from  which  variables  were  derived, 
but  R1  data  was  used  in  this  research  for  verification  and  as  a  way  to  test  the 
model’s  sensitivity  to  a  specific  reanalysis  system. 

3.  NOAA  OISST 

Just  as  the  atmospheric  reanalyses  are  invaluable  tools  in  developing 
empirical  prediction  methods,  so  too  is  a  quality  database  of  SSTs.  The  SST 
data  used  in  developing  our  statistical  model  is  the  National  Oceanic  and 
Atmospheric Administration (NOAA) optimum interpolation (OI) SST analysis
version  2.  SST  values  from  this  dataset  are  available  in  weekly  means  from 
1981  to  present,  at  one  degree  latitude  by  one  degree  longitude  horizontal 
resolution on a global grid. OISST data combine in situ and satellite-derived SST
measurements with bias adjustments (Reynolds et al. 2002).

In order to match our R1 and R2 data, the OISST fields were regridded
from one degree resolution to 2.5° horizontal resolution and interpolated to daily
values.
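
The regridding and time interpolation described above can be sketched as follows. This is only an illustration: the thesis does not state which interpolation scheme was used, so linear interpolation in space and time is an assumption, and the function names `weekly_to_daily` and `regrid_linear` are hypothetical.

```python
import numpy as np

def weekly_to_daily(weekly, week_days, daily_days):
    """Linearly interpolate weekly means (time, lat, lon) to daily values.

    week_days and daily_days are day counts on a common time axis;
    week_days must be increasing.
    """
    nt, ny, nx = weekly.shape
    flat = weekly.reshape(nt, -1)
    out = np.empty((len(daily_days), flat.shape[1]))
    for j in range(flat.shape[1]):
        out[:, j] = np.interp(daily_days, week_days, flat[:, j])
    return out.reshape(len(daily_days), ny, nx)

def regrid_linear(field, src_lat, src_lon, dst_lat, dst_lon):
    """Separable linear interpolation of a (lat, lon) field to a new grid.

    Source coordinates must be increasing and should bracket the
    destination coordinates.
    """
    # Interpolate along longitude for every source latitude row.
    tmp = np.array([np.interp(dst_lon, src_lon, row) for row in field])
    # Then interpolate along latitude for every destination longitude column.
    return np.array([np.interp(dst_lat, src_lat, col) for col in tmp.T]).T
```

A one-degree field would be passed through `regrid_linear` with 2.5° destination coordinates, and the resulting weekly stack through `weekly_to_daily`.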

4.  NCEP  CFS 

The  NCEP  Climate  Forecast  System  (CFS)  is  a  one-tier  fully  coupled 
ocean-land-atmosphere  dynamical  assimilation  and  prediction  system,  which  has 
been  operational  at  NCEP  since  August  2004  (Saha  et  al.  2006).  The 
atmospheric  component  of  this  coupled  system  is  a  reduced-resolution  version  of 
the  more-familiar  2003  operational  NCEP  Global  Forecast  System  (GFS),  with 
T62L64 resolution (equivalent to ~200 km Gaussian grid); the initial conditions
are  obtained  from  the  operational  R2  (Saha  2008).  This  atmospheric  component 
is  coupled  once  per  day,  without  flux  correction,  with  the  Geophysical  Fluid 
Dynamics  Laboratory  (GFDL)  Modular  Ocean  Model  version  3  (MOM3).  Four 
CFS  runs  are  executed  daily,  with  integrations  out  to  nine  months.  Of  the  two 
runs  initialized  at  00Z  and  at  12Z,  each  has  the  same  initial  oceanic  state,  but  a 
slightly  perturbed  atmospheric  state.  The  initial  conditions  for  these  runs  are  one 
day  old  for  both  the  atmospheric  and  oceanic  variables  (Saha  2008). 

One  appealing  feature  of  the  CFS  is  the  availability  of  hindcast  and  bias 
correction  fields.  As  noted  in  Section  I.B.3.b.,  GCMs  are  often  plagued  by 
systematic  errors.  We  are  able  to  remove  much  of  this  systematic  error,  namely 
climate  drift,  by  employing  the  forecast  climatology  that  is  available  for  all 
forecast  lead  times  and  the  daily  observed  climatology.  Such  corrective  fields 
are  only  available  for  a  subset  of  variables. 
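
One common way to apply such a correction is to take the forecast anomaly relative to the model's own lead-dependent climatology and add it to the observed climatology. The sketch below assumes that form; the exact procedure used with the CFS correction fields is not spelled out here, and the function name `bias_correct` is hypothetical.

```python
import numpy as np

def bias_correct(forecast, model_clim, obs_clim):
    """Remove lead-dependent drift from a forecast.

    forecast   : raw model forecast, shape (lead, lat, lon)
    model_clim : model forecast climatology for the same leads and valid days
    obs_clim   : observed climatology for the same valid days
    All three arrays must share the same shape (assumed layout).
    """
    anomaly = forecast - model_clim   # departure from the model's own mean state
    return obs_clim + anomaly         # anomaly placed on the observed mean state
```

If the model drifts warm by a fixed amount at a given lead, that amount appears in both `forecast` and `model_clim` and cancels in the anomaly.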

From  the  available  fields,  in  gridded  binary  format,  we  manually  extracted 
daily  SSTs  at  one-degree  global  coverage,  daily  atmospheric  variables  converted 
from  their  native  Gaussian  grid  to  a  2.5°  latitude/longitude  grid,  and  the 
appropriate  bias  correction  fields.  Once  the  variables  were  bias  corrected  and  on 
the same latitude/longitude grid, we used the SSTs and atmospheric variables to
force  a  statistical  model  to  provide  a  probability  of  TC  formation  at  every  grid 
point. 

With  numerous  GCMs  being  used  in  climate  science,  one  may  wonder 
why we chose the CFS. In addition to being freely and publicly available, the
CFS is the first operational, dynamical model with predictive skill on par with
statistical methods used at CPC (Saha et al. 2006). Saha et al. (2006) also
note that the “Nino-3.4 SST is probably the single most predictable entity [within
the CFS].” Though the Nino-3.4 region is just outside of our WNP study region,
we were motivated by the relatively high skill of the CFS in the Pacific basin,
especially  since  prior  studies  have  shown  that  SST  variability  in  the  Nino-3.4 
region  is  closely  related  to  variations  in  the  large  scale  environmental  factors  that 
influence  TC  formations  in  the  WNP  (Ford  2000;  Chan  2004).  In  addition  to  the 
perceived  skill,  the  CFS  also  offers  a  rudimentary  ensemble  construct.  With  two 
runs  executed  twice  daily,  the  potential  exists  for  a  four-run  ensemble  with 
perturbed  initial  conditions.  One  could  increase  the  number  of  ensemble 
members  by  incorporating  runs  from  other  days  as  well.  The  intention  for  the 
ensemble  approach  is  to  smooth  out  the  differences  between  the  runs  in  order  to 
bring  out  the  more  predictable  elements  and,  thereby,  lead  to  enhanced 
predictive  skill  on  average. 
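
The averaging step behind this ensemble idea can be illustrated with a short sketch. It assumes equally weighted members on a common grid; whether the averaging is applied to the forecast fields or to the resulting probabilities is not specified here, and the function name `ensemble_mean` is hypothetical.

```python
import numpy as np

def ensemble_mean(members):
    """Average equally weighted ensemble members of identical shape."""
    return np.mean(np.stack(members, axis=0), axis=0)
```

Differences between members that are unrelated to the common signal tend to cancel in the mean, which is the smoothing effect described above.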

C.  VARIABLES  OF  INTEREST 

The  existence  of  a  set  of  large-scale  environmental  factors  (LSEFs)  that 
are  influential  in  TC  formation  has  been  well  documented  over  the  last  half 
century.  Gray  (1968,  1975,  1979)  outlined  a  physical  climatology  of  tropical 
cyclogenesis  relative  to  six,  so-called  genesis  parameters.  Other  authors  vary 
the  list  of  these  parameters,  or  factors,  slightly  and  at  times  condense  the  list 
(e.g.,  low-level  relative  vorticity  and  the  Coriolis  parameter  may  be  combined  into 
a  single  absolute  vorticity  term).  Regardless  of  the  specific  list  of  LSEFs  one 
chooses,  the  physical  properties  are  arguably  quite  similar. 

The LSEFs may each be necessary for tropical cyclogenesis, but a
combination  of  these  parameters  alone  may  not  be  sufficient  to  diagnose  or 
predict  the  transition  from  a  convective  disturbance  into  an  organized  TC  (Frank 
1987).  This  idea  of  “necessary  but  not  sufficient”  suggests  that  the  large-scale 
environment  may  not  be  solely  responsible  for  determining  whether  a  TC  forms 
or  not.  Frank  (2006)  contends  that  “individual  storms  form  infrequently  and 
sporadically  within  large  areas  of  favorable  environmental  conditions  due  to  the 
effects  of  local  flow  perturbations.”  Such  a  mesoscale  trigger  and/or  perturbation 
in  the  local  flow  may  be  required  to  instigate  tropical  cyclogenesis,  but  abundant 
research  supports  the  profound  role  of  large-scale  external  forcing  as  a 
determining  factor  in  tropical  cyclogenesis  (Briegel  and  Frank  1997). 

An  underlying  goal  of  this  study  is  to  predict  favorable  regions  for  tropical 
cyclogenesis  at  intraseasonal  lead  times,  by  way  of  forcing  a  statistical  model 
with  available  output  from  a  GCM.  In  our  case,  we  use  the  NCEP  CFS  as  our 
dynamical  GCM,  from  which  not  all  the  LSEFs  are  available.  To  remedy  this,  we 
had  to  accomplish  two  tasks.  First,  we  had  to  consider  other  parameters  that  are 
similar  to  the  LSEFs  described  in  the  literature  and  may  represent  the  same 
processes,  and  for  which  intraseasonal  forecasts  are  readily  available  from  the 
CFS.  Second,  we  had  to  calculate  additional  variables  based  on  available  model 
output  fields.  For  variables  requiring  spatial  derivatives,  we  employed  second 
order  centered  finite  differencing;  see  Appendix  B  for  more  information  regarding 
the  calculation  of  additional  variables  from  available  fields. 
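
As an illustration of this kind of calculation (Appendix B holds the actual details), the sketch below computes horizontal divergence and relative vorticity on a regular latitude/longitude grid. It assumes the standard spherical-coordinate forms and uses `numpy.gradient`, which applies second-order centered differences at interior points and one-sided differences at the grid edges. The function names and constants are illustrative, not taken from the thesis.

```python
import numpy as np

EARTH_RADIUS = 6.371e6   # m
OMEGA = 7.292e-5         # Earth's rotation rate, rad/s

def divergence_and_vorticity(u, v, lat_deg, lon_deg):
    """Divergence and relative vorticity (s^-1) from winds on a (lat, lon) grid."""
    lat = np.deg2rad(lat_deg)
    lon = np.deg2rad(lon_deg)
    coslat = np.cos(lat)[:, None]
    du_dlon = np.gradient(u, lon, axis=1)
    dv_dlon = np.gradient(v, lon, axis=1)
    dvcos_dlat = np.gradient(v * coslat, lat, axis=0)
    ducos_dlat = np.gradient(u * coslat, lat, axis=0)
    div = (du_dlon + dvcos_dlat) / (EARTH_RADIUS * coslat)
    vort = (dv_dlon - ducos_dlat) / (EARTH_RADIUS * coslat)
    return div, vort

def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude), in s^-1."""
    return 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))
```

Absolute vorticity then follows as the sum of the relative vorticity and `coriolis(lat_deg)[:, None]`. The grid must not include the poles, where the cosine factor vanishes.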

The  genesis  parameters,  as  proposed  by  Gray  (1975),  can  be  subdivided 
into  thermodynamic  and  dynamic  parameters.  What  follows  is  a  brief  look  at 
these  parameters,  as  well  as  some  of  the  additional  variables  we  considered. 
For  the  sake  of  brevity,  not  all  of  the  variables  that  we  investigated  in  our 
research  are  presented  here.  For  more  information  on  LSEFs  and  how  they 
relate  to  TC  development,  the  reader  is  directed  to  any  of  the  plethora  of  articles 
and  books  on  the  subject  (e.g.,  Gray  1975;  Frank  1987). 

1. Thermodynamic Parameters

Research suggests that sufficiently high SSTs and moisture in the mid-troposphere
are important not only for TC formation, but also for tropical deep
convection.  These  thermodynamic  variables  are  often  favorable  for  TC 
development  over  much  of  the  tropical  Pacific  during  much  of  the  year  (see 
Chapter  III). 


a.  Sea  Surface  Temperatures 

Frank  (1987)  contends  that  the  high  frequency  of  TC  formation  in 
the  WNP,  as  compared  to  other  ocean  basins,  is  due,  in  part,  to  an  expansive 
area  of  warm  water  (e.g.,  water  warmer  than  26°C). 


Figure 8. Average June - November SST (in °C) conditions over the WNP for
the period 1982-2000, plotted from NOAA OISST data interpolated to 2.5°
horizontal resolution.


Figure  8  depicts  the  average  SST  conditions  over  the  WNP  during 
the  peak  formation  season.  Such  a  large  expanse  of  warm  water  has  led  some 
researchers  to  conclude  that  SSTs  may  not  be  a  primary  factor  affecting 
formation  in  the  tropical  Pacific  (Chan  2004),  as  the  temperatures  are  often 
sufficiently  warm  (e.g.,  greater  than  26°C). 

[Figure: Sea Surface Temperature Boxplot]

Figure 9. Box plots of grid values of SST (in °C) grouped according to
whether a TC formed in that day-grid box (indicated by Yes on the
horizontal axis) or did not form in that box (indicated by No). The box
height encompasses 50% of the SST data points, the whiskers (dashed
lines) extend to include ~99% of the SST data points, and SST data that
fell outside the whiskers (outliers) are indicated by the red “+” symbols.
Constructed from NOAA OISST data with TC occurrences from the JTWC
best track archive for the January-December period of 1982-2006.


The box plots in Figure 9 separate the SSTs in each 2.5° latitude x 2.5°
longitude by one-day grid block according to whether a TC formed in the grid block
(“Yes”) or not (“No”). The comparatively constrained appearance of the “Yes”
box plot indicates that TCs tend to form within a narrow range of
SSTs, from the upper 20s to the low 30s in degrees Celsius. Numerous sources, such
as Frank (2006) and Meyer (2007), note that SSTs must meet or exceed 26.5°C
to favorably support TC formation. These box plots support that, and suggest
that a threshold for the WNP may be even more restrictive (i.e., > 28°C).
Physical reasoning and the sort of relationships shown in Figure 9 indicate that
SST and TC formation probability at a given location should be directly and
positively related to each other, if all other factors that influence TC formation are
favorable and held constant.

b.  Humidity 

Early  research  on  the  climatologies  of  WNP  TCs  indicates  that  TCs 
only  form  in  regions  where  seasonally  averaged  values  of  mid-tropospheric 
moisture  are  high.  The  physical  explanation  is  that  moist  air  in  the  middle 
troposphere  is  more  conducive  to  deep  convection  and  vertical  coupling  of  the 
atmosphere  (Gray  1975). 


[Figure panel a): June - November LTM 500mb Relative Humidity (%)]

Figure 10. Average June - November a) 500mb relative humidity (in %) and b)
precipitable water (in kg m⁻²) conditions over the WNP for the period
1971-2000, plotted from R1 data.

Mid-tropospheric humidity variables are not available from the CFS,
so we looked to precipitable water as a viable alternative to represent the
available environmental moisture. Figure 10 shows average conditions during
the  peak  formation  period  of  mid-level  relative  humidity  and  total-column 
precipitable  water.  Though  the  units  are  not  directly  comparable,  one  should 
note  the  spatial  agreement  of  the  location  of  high  humidities  to  the  location  of 
greatest  precipitable  water.  As  with  SST,  physical  reasoning  indicates  that,  all 
other  factors  being  favorable  and  constant,  an  increase  in  atmospheric  moisture 
content  should  lead  to  an  increase  in  the  probability  of  TC  formation.  We 
confirmed this with moisture-TC formation box plots (not shown) similar to those
in  Figure  9. 

2.  Dynamic  Parameters 

As  noted  earlier,  favorable  thermodynamic  conditions  are  often  present 
over  expansive  swaths  of  the  WNP  much  of  the  year;  therefore,  dynamic 
parameters  are  thought  to  be  responsible  for  determining  whether  a  TC  will  form 
in  a  region  that  is  thermodynamically  favorable  for  TC  formation.  Gray  (1975) 
noted  the  comparatively  small  spatial  and  temporal  scales  over  which  a 
disturbance  will  interact  with  its  surrounding  dynamic  environment.  These  subtle 
interactions  at  smaller  scales  provide  the  motivation  to  use  data  at  2.5°  resolution 
and  daily  time  steps  for  this  study,  versus  the  previous  work  by  Meyer  (2007)  that 
used  5°  data  at  weekly  time  steps. 

a.  Shear 

Numerous  studies  have  found  that  large  values  of  vertical  wind 
shear  in  the  large-scale  environment  tend  to  suppress  TC  formations.  Though 
various  definitions  exist  in  literature,  the  most  common  measure  of  vertical  wind 
shear  is  the  mean  vector  wind  at  850  mb  subtracted  from  the  mean  vector  wind 
at  200  mb.  Such  a  calculation  results  in  a  magnitude  and  direction,  though  the 
magnitude  alone  is  used  in  this  work.  Near  the  monsoon  trough  axis,  vertical 
wind  shear  is  minimal,  allowing  deep  convection  to  be  sustained  and  increasing 
the  likelihood  of  TC  formation  in  the  region  (Chan  2004). 
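
The shear measure defined above (the magnitude of the 200 mb minus 850 mb vector wind difference) can be written compactly. This is a minimal sketch; the function name `shear_magnitude` and the argument layout are illustrative rather than taken from the thesis.

```python
import numpy as np

def shear_magnitude(u200, v200, u850, v850):
    """Magnitude (m/s) of the 200 mb minus 850 mb vector wind difference.

    Inputs are zonal (u) and meridional (v) wind components; scalars or
    arrays of matching shape are accepted.
    """
    return np.hypot(np.asarray(u200) - np.asarray(u850),
                    np.asarray(v200) - np.asarray(v850))
```

For example, 200 mb winds of (8, 6) m/s over 850 mb winds of (5, 2) m/s give a difference vector of (3, 4) m/s and a shear magnitude of 5 m/s.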

[Figure: June - November LTM Shear (m/s)]

Figure 11. Average June - November magnitude of vertical wind shear (in m
s⁻¹) over the WNP for the period 1971-2000, derived from R1 data.

Figure 11 displays the mean magnitude of vertical wind shear over
the WNP during the peak formation season. The reader should note the
co-location of the low mean shear pattern (Figure 11), the climatological monsoon
trough (Figure 7), and the highest climatological probabilities of formation (Figure
6). The box plots in Figure 12 solidify the relationship between the magnitude of
vertical wind shear and probability of TC formation: TCs form in regions of low
environmental wind shear.




[Figure: Shear Boxplot; horizontal axis: TC Occurrence (No, Yes)]

Figure 12. Box plots of grid values of the magnitude of the mean vertical wind
shear (in m s⁻¹) grouped according to whether a TC formed in that day-grid
box. The box height encompasses 50% of the data points, the whiskers
(dashed lines) extend to include ~99% of the data points, and points that
fall outside the whiskers (outliers) are indicated by the red “+” symbols.
Constructed from R2 data with TC occurrences from the JTWC best track
archive for the January-December period of 1982-2006.


b. Upward Vertical Motion/Velocity

During  the  peak  formation  season  in  the  WNP,  warm  waters  lie  just 
to  the  west  of  the  tropical  upper  tropospheric  trough  (TUTT)  and  near  the 
entrance  region  of  the  climatological  tropical  easterly  jet.  Both  features 
contribute  to  regions  of  upper-level  divergence  and/or  persistent  upward  vertical 
motion,  both  shown  to  be  favorable  for  cyclogenesis  (Frank  1987). 

Figure 13. Average June - November a) 500mb omega (in Pa s⁻¹) and b)
200mb divergence (in s⁻¹) conditions over the WNP for the period 1971-2000,
derived from R1 data.

Just  as  with  the  moisture  variables,  the  availability  of  variables  from 
the  CFS  influenced  our  choice  of  the  variables  to  use  to  represent  vertical  motion 
in  our  statistical  model.  A  variable  directly  representing  vertical  motion  is  not 
readily  available  from  the  CFS  at  daily  time  steps,  thus  we  opted  to  test  200  mb 
divergence  (calculated  from  the  200  mb  zonal  and  meridional  wind  fields;  see 
Appendix  B  for  more  information  regarding  these  calculations).  Figure  13  depicts 
the  peak  season  averages  of  500  mb  omega  and  200  mb  divergence. 

[Figure: Variable Sensitivity]


Figure  14.  Normalized  January-December  500  mb  omega  vs.  200  mb 
divergence  scatter  plot,  displaying  sensitivity  between  the  variables, 
constructed  from  R2  data  for  the  period  1982-2006. 

Though  the  spatial  patterns  in  Figure  13  suggest  that  upper-level 
divergence  may  be  a  suitable  alternative  to  the  more-traditional  omega,  we 
sought  to  test  the  sensitivity  of  these  two  variables.  Figure  14  is  a  scatter  plot  of 
normalized  divergence  versus  omega.  Knowing  the  opposing  sign  conventions, 
the  negative  slope  to  the  elongated  cluster  suggests  that  the  variables  are 
reasonably  correlated,  and  that  divergence  may  be  a  suitable  replacement  for 
omega.  The  box  plots  in  Figure  15  indicate  that  TCs  form  in  the  WNP  when  200 
mb  divergence  is  weak,  but  skewed  towards  divergent  outflow  aloft. 
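
The comparison between omega and divergence can be quantified with a correlation coefficient computed from the normalized variables. The sketch below assumes that "normalized" means standardized to zero mean and unit variance, which the thesis does not state explicitly; the function names are hypothetical.

```python
import numpy as np

def standardize(x):
    """Convert values to z-scores (zero mean, unit standard deviation)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()

def pearson_r(x, y):
    """Pearson correlation as the mean product of the two z-score series."""
    return float(np.mean(standardize(x) * standardize(y)))
```

Because upward motion corresponds to negative omega while outflow aloft corresponds to positive divergence, a physically consistent relationship appears as a negative correlation, matching the negative slope of the scatter in Figure 14.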

Figure 15. Box plots of grid values of upper-level divergence (in s⁻¹) grouped
according to whether a TC formed in that day-grid box. The box height
encompasses 50% of the data points, the whiskers (dashed lines) extend
to include ~99% of the data points, and points that fall outside the
whiskers (outliers) are indicated by the red “+” symbols. Constructed from
R2 data with TC occurrences from the JTWC best track archive for the
January-December period of 1982-2006.


c.  Vorticity 

The  final  genesis  parameter  is  vorticity  in  the  lower  troposphere. 
As their behavior is different, we chose to investigate relative vorticity and
planetary vorticity (as represented by the Coriolis parameter f) separately, as
well as combined into a single low-level absolute vorticity term. Frank (1987)
notes  that  relative  vorticity  may  result  from  several  sources,  including  from  the 
intensification  of  monsoon  trough  circulations,  waves  in  the  easterlies,  or  along 
frontal  zones  that  extend  into  the  tropics. 

Figure 16. Average June - November 850 mb relative vorticity (in s⁻¹)
conditions over the WNP for the period 1971-2000, derived from R1 data.

The  spatial  pattern  of  Figure  16  should  be  familiar  to  the  reader  by 
this  point,  with  the  greatest  average  values  of  850  mb  relative  vorticity  in  spatial 
agreement  with  the  monsoon  trough  figure  described  earlier  in  this  chapter.  Of 
the  three  mechanisms  noted  by  Frank  that  intensify  relative  vorticity,  only  the 
monsoon  trough  is  persistent  enough  to  be  clearly  represented  in  this  six-month 
composite. 




[Figure: 850hPa Relative Vorticity Boxplot; horizontal axis: TC Occurrence (No, Yes)]

Figure 17. Box plots of grid values of low-level relative vorticity (in s⁻¹) grouped
according to whether a TC formed in that day-grid box. The box height
encompasses 50% of the data points, the whiskers (dashed lines) extend
to include ~99% of the data points, and points that fall outside the
whiskers (outliers) are indicated by the red “+” symbols. Constructed from
R2 data with TC occurrences from the JTWC best track archive for the
January-December period of 1982-2006.


The box plots in Figure 17 support what many previous authors
have found: that weak to positive low-level relative vorticity relates to an increase
in TC formation probability. Not shown are similar sets of plots for planetary
vorticity and absolute vorticity. In agreement with previous studies, we find that
the Coriolis parameter has a positive relationship with TC formation, and that the
vast majority of TCs form several degrees or more from the equator.

3.  Model  Variable  Selection 

Several  of  the  variables  that  are  either  directly  available  from  the  CFS  or 
are  easily  derived  from  CFS  output  represent  similar  large-scale  environmental 
conditions  and  processes.  Thus,  to  represent,  for  example,  vertical  motion,  we 
had to choose between 200 mb divergence and 200 mb relative vorticity, since
both  of  these  variables  represent  vertical  motion  (the  former  does  so  explicitly 
and  the  latter  does  so  implicitly).  In  making  this  sort  of  choice,  we  favored 
variables  that: 

1) have physically plausible relationships to TC formation,

2)  are  readily  available  directly  from  the  CFS, 

3)  or  are  easily  derivable  from  available  CFS  variables,  and 

4)  are  relatively  skillfully  predicted  by  the  CFS. 

4.  Climate  Oscillations  and  Model  Variable  Relationships 

Numerous  prior  studies  describe  the  intraseasonal  and  interannual 
variability  of  TC  formation,  especially  as  they  relate  to  climate  oscillations  (e.g., 
Ford  2000;  Chan  2004).  Of  the  climate  oscillations  that  impact  TC  activity,  the 
most  often  investigated  are  El  Nino  and  La  Nina  (ENLN).  ENLN  are  anomalous 
oscillations  of  the  tropical  atmosphere  and  ocean  that  can  alter  the  large-scale 
environment  in  ways  that  influence  TC  formations,  intensities,  and  tracks  (e.g., 
Ford  2000).  Wang  and  Chan  (2002)  offer  a  good  illustration  of  how  ENLN  can 
influence  TC  activity.  They  note  that  during  the  latter  months  of  an  El  Nino  year, 
low-level  anomalous  westerlies  encompass  much  of  the  WNP.  These 
anomalous  winds  lead  to  positive  relative  vorticity  anomalies  in  the  region,  which 
provide  a  favorable  environment  for  TC  formation  that  is  both  later  in  the  year 
than  normal,  and  displaced  farther  to  the  east. 

In  addition  to  ENLN,  much  focus  has  been  directed  at  the  influence  of 
intraseasonal  tropical  oscillations;  for  example,  investigations  into  the  effects  of 
the  MJO  on  TC  formation  in  the  WNP.  Frank  and  Roundy  (2006)  show  that 
when  MJO  activity  is  high,  TCs  are  more  likely  to  form  in  the  convectively  active 
portions of the MJO. As with ENLN, it is likely that the impacts of intraseasonal
variations  on  TCs  occur  mainly  via  alterations  of  the  LSEFs. 

Though climate oscillations and their impacts on TC activity are beyond
the  scope  of  this  thesis,  these  brief  notes  are  included  because  of  their 
relationship  to  the  subject  of  this  thesis:  intraseasonal  prediction  of  tropical 
cyclogenesis.  With  changes  in  the  large-scale  circulation  in  the  tropics,  the 
thermodynamic  and/or  dynamic  genesis  parameters  may  be  modified;  these 
modifications,  in  turn,  alter  the  TC  activity.  The  idea  is  that  if  oscillations  (ENLN, 
MJO,  etc.)  that  have  been  shown  to  impact  TC  formation  are  skillfully  predicted 
by  the  CFS,  including  the  variations  in  the  LSEFs  associated  with  these  climate 
oscillations,  then  a  statistical-dynamical  method  based  on  the  relationships 
between  the  LSEFs  and  TC  formations  can  be  skillful  regardless  of  the  oscillatory 
state  of  the  climate  system. 


D. PROBABILISTIC EQUATION DEVELOPMENT

1. Logistic Regression

Logistic  regression,  also  referred  to  as  logit  regression,  is  an  appropriate 
statistical  tool  for  this  application.  Given  a  combination  of  independent  variables, 
logistic regression provides the probability of occurrence of the dependent, binary
variable. Let $p_F$ be the probability of TC formation at a given grid point for a
given time period (one day, in this study); since $p_F$ is a probability, it is bounded
by zero and one.

The natural logarithm of the odds, $p_F / (1 - p_F)$, is called the logit,
where:

$$\mathrm{Logit} = \ln\left(\frac{p_F}{1 - p_F}\right)$$

We used the statistical analysis software S-Plus to find the optimal values
of the intercept $b_0$ and the coefficients $b_k$ for each contributing variable $x_k$, such
that:

$$\mathrm{Logit} = b_0 + b_1 x_1 + \dots + b_k x_k$$

Then the probability of TC formation may be calculated based on a linear
combination of the optimal value coefficients and explanatory variables:

$$p_F = \frac{e^{(b_0 + b_1 x_1 + \dots + b_k x_k)}}{1 + e^{(b_0 + b_1 x_1 + \dots + b_k x_k)}}$$

For more information regarding logistic regression, the reader is directed
towards Wilks (2006), Devore (200), or most any college-level statistics text.
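
The relationships above can be illustrated with a short sketch. The fitting routine below uses Newton-Raphson iteration (iteratively reweighted least squares), a standard way to maximize the logistic likelihood; it is an assumption for illustration and not the S-Plus procedure used in this study. All function names are hypothetical.

```python
import numpy as np

def logit(p):
    """Natural logarithm of the odds p / (1 - p)."""
    p = np.asarray(p, dtype=float)
    return np.log(p / (1.0 - p))

def formation_probability(b0, b, x):
    """Probability from intercept b0, coefficients b, and predictors x.

    1 / (1 + exp(-z)) is algebraically equal to exp(z) / (1 + exp(z)).
    """
    z = b0 + np.dot(x, b)
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, n_iter=25):
    """Estimate (b0, b) by Newton-Raphson on the logistic log-likelihood.

    X has shape (n_samples, n_predictors); y holds 0/1 outcomes.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(y)), X])    # prepend intercept column
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))
        w = p * (1.0 - p)                        # binomial variance weights
        grad = A.T @ (y - p)                     # gradient of log-likelihood
        hess = A.T @ (A * w[:, None])            # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    return beta[0], beta[1:]
```

Applying `logit` to the output of `formation_probability` returns the linear combination of coefficients and predictors, which is the defining property of the model.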

a. Dependent Variable

Within  the  framework  of  logistic  regression,  TC  formation  at  a  given 
grid  point  is  modeled  as  a  binary  response  variable  and  is  expressed  as  either 
zero  (no  formation)  or  one  (formation  observed).  As  such,  this  approach 
provides  the  model  with  no  information  as  to  the  strength  or  duration  of  the 
storm.  We  feel  that  this  approach  remains  viable  despite  this  limitation.  McBride 
(1981)  comments,  with  respect  to  compositing  data,  that  “the  averaging  process 
smears  out  the  diversity  between  different  systems  and  enhances  features  in 
common.” As such, we hope our method is applicable over more scenarios, as a
result  of  including  a  wide  variety  of  storms  in  the  training  of  the  regression  model. 

b.  Independent  Variables 

Technically, the approach we are using is multiple logistic
regression, as we are allowing multiple independent, or explanatory, variables to
contribute to the probability. Ideally, all the independent variables in a multiple
logistic regression analysis would be just that, independent. As noted earlier,
the  LSEFs  are  inter-related  in  a  linked  ocean-atmosphere  system,  thus  the 
variables  will  all  have  some  degree  of  correlation  with  each  other.  This  lack  of 
true  independence  will  allow  combinations  of  variables  to  negate  the  need  for 
others.  For  example,  high  relative  humidity  often  occurs  in  regions  of  warm  SST, 
positive  low-level  vorticity,  and  upward  vertical  motion.  Therefore,  if  the  latter 
three  variables  favorably  exist,  the  addition  of  a  humidity  variable  may  not  be 
required  to  ascertain  the  favorability  of  the  large-scale  environment. 

2. Model Training

Statistical  methods,  such  as  logistic  regression,  predict  the  response  to 
variables  based  on  a  historical  record;  therefore,  one  must  reach  a  balance 
between  the  length  and  the  quality  of  the  climate  record.  For  this  reason,  we 
utilize  data  only  from  the  satellite  era.  In  developing  such  statistical  tools,  one 
must  also  assume  a  degree  of  stationarity  of  the  climate  system,  which  we  know 
is  not  entirely  the  case. 

We  used  R2  and  OISST  data  to  train  our  statistical  model;  the  availability 
of  both  of  those  datasets  limited  us  to  the  years  1982  -  2006,  inclusive.  Various 
forms  of  the  model  were  tested,  some  of  which  were  trained  over  the  entire  year, 
others  were  trained  over  just  the  peak  formation  season. 

When  a  model  is  trained  over  all  months  for  the  25-year  period,  the  size  of 
the  dataset  becomes  somewhat  cumbersome  [13  (latitude  grid  boxes  in  WNP)  x 
37  (longitude  grid  boxes)  x  365  (days,  excluding  leap  days)  x  25  (years)  = 
4,389,125 day grid points per variable!]. One approach to reducing the size of the
dataset is to include all the points wherein a TC was observed, but only
include  a  portion  of  the  remaining  “non-occurrence”  points.  We  refer  to  the  data 
from  all  the  day  grid  points  at  which  a  TC  was  not  observed  as  non-TC 
information  (NTCI).  Various  forms  of  the  model  were  tested  using  various 
amounts  (as  percentages)  of  NTCI. 
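As an illustration only, the NTCI thinning described above might be sketched as follows. The function name `subsample_ntci`, the random seed, and the toy data are hypothetical and are not taken from the original work; the sketch simply keeps every TC point and a random fraction of the non-occurrence points.

```python
import numpy as np


def subsample_ntci(occurrence, ntci_fraction, seed=0):
    """Return sorted indices of the day grid points kept for training.

    occurrence: 1-D array of 0/1 flags (1 = TC formation observed).
    ntci_fraction: share of non-occurrence points to retain, in [0, 1].
    """
    occurrence = np.asarray(occurrence).astype(bool)
    rng = np.random.default_rng(seed)
    tc_idx = np.flatnonzero(occurrence)        # keep every TC point
    non_tc_idx = np.flatnonzero(~occurrence)   # candidates for thinning
    n_keep = int(round(ntci_fraction * non_tc_idx.size))
    kept_non_tc = rng.choice(non_tc_idx, size=n_keep, replace=False)
    return np.sort(np.concatenate([tc_idx, kept_non_tc]))


# Toy example: 1,000 day grid points, 5 of them TC formations.
flags = np.zeros(1000, dtype=int)
flags[[10, 200, 450, 700, 999]] = 1
kept = subsample_ntci(flags, ntci_fraction=0.40)
n_tc_kept = int(flags[kept].sum())
n_non_tc_kept = int(kept.size - n_tc_kept)
```

In this toy case all five TC points survive, while only 40% of the 995 non-occurrence points are retained.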

3.  Model  Selection 

We  made  use  of  a  series  of  tests  to  ensure  our  model  was  statistically 
sound  and  to  assess  the  overall  goodness  of  fit.  Our  goal  was  to  develop  a 
model  that  is  physically  defensible,  stable,  and  reliable.  The  following  tools  are 
among  those  we  relied  upon  to  select  our  model  from  the  numerous  forms  of  the 
model  that  we  tested. 
a. Akaike Information Criterion


The  Akaike  Information  Criterion  (AIC)  is  a  goodness-of-fit  measure 
that  seeks  to  find  a  balance  between  model  fit  and  complexity.  The  model 
complexity  is  handled  by  imposing  a  penalty  for  the  number  of  terms  included  in 
the  equation.  A  lower  AIC  suggests  a  better-fitting  model.  Refer  to  Wilks  (2006) 
or  Burnham  and  Anderson  (2002)  for  more  information  concerning  AIC. 

b.  Deviance 

We also used the residual deviance to compare models. Put simply, the
amount of deviance explained by a model indicates how much of the variability
is accounted for by the combination of terms included in the model. The logic
of this test is that the better a model fits the data, the lower the residual
deviance associated with that model.
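The two measures above can be illustrated with a minimal sketch, assuming ungrouped 0/1 outcomes, for which the residual deviance equals minus twice the log-likelihood and the AIC adds twice the number of fitted coefficients. The function names and the small sample values are hypothetical and serve only to show the comparison.

```python
import numpy as np


def binomial_deviance(y, p, eps=1e-12):
    """Residual deviance, -2 * log-likelihood, for 0/1 outcomes y."""
    y = np.asarray(y, dtype=float)
    p = np.clip(np.asarray(p, dtype=float), eps, 1.0 - eps)
    log_lik = np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float(-2.0 * log_lik)


def aic(y, p, n_params):
    """AIC = deviance + 2 * (number of fitted coefficients)."""
    return binomial_deviance(y, p) + 2.0 * n_params


# Hypothetical outcomes and fitted probabilities from two candidate models.
y = np.array([0, 0, 0, 1, 0, 1, 0, 0])
p_null = np.full(y.size, y.mean())   # intercept-only model
p_fit = np.array([0.05, 0.10, 0.05, 0.70, 0.10, 0.60, 0.20, 0.10])

dev_null = binomial_deviance(y, p_null)
dev_fit = binomial_deviance(y, p_fit)
aic_null = aic(y, p_null, n_params=1)
aic_fit = aic(y, p_fit, n_params=3)
```

Here the three-coefficient model has both a lower deviance and, despite its larger complexity penalty, a lower AIC than the intercept-only model.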

c.  Stability 

To assess whether our model contains too many explanatory
variables, a condition often referred to in statistics as overfitting, we examined
how much the variable coefficients vary when the model is constructed over
different training periods. The less the coefficients vary when derived from
different training periods, the more stable the model and the lower the
probability that it is overfit.

d.  Physical  Plausibility 

A  viable  model  must  indicate  relationships  that  fit  the  conceptual 
models  identified  in  prior  studies  and  noted  in  the  section  on  LSEFs.  For 
example,  we  expect  SSTs  to  have  a  positive  relationship  with  the  likelihood  of  TC 
formation.  A  model  that  suggests  a  negative  relationship  between  SST  and  the 
likelihood  of  TC  formation  would  be  suspect.  In  our  research,  we  encountered 
models  that  suggested  humidity  and  the  probability  of  TC  formation  are  inversely 
related;  such  a  negative  coefficient  is  not  physically  defensible  and  likely  results 
from multicollinearity between the LSEFs included in the model. This
multicollinearity may result from a lack of independence among predictor
variables,  in  this  case  between  humidity  and  the  other  variables  included  in  the 
regression  equation  (e.g.,  SST,  divergence;  see  also  Devore  2000). 

4.  Model  Verification 

We  made  use  of  several  key  metrics  for  assessing  the  skill  and  value  of 
our  model  when  the  model  was  used  to  conduct  multi-year  zero-lead  hindcasting 
and  non-zero  lead  hindcasting  case  studies.  Such  metrics  include  the  number  of 
hits  and  misses,  the  Brier  score  (BS)  and  Brier  skill  score  (BSS),  the  reliability 
diagram,  the  relative  operating  characteristic  (ROC)  curve,  and  the  economic 
value  diagram  (EVD). 

5.  Motivations  for  a  Probabilistic  Forecast 

Among  the  reasons  for  selecting  multivariate  logistic  regression  as  the 
statistical  tool  by  which  to  develop  a  statistical-dynamical  prediction  method  are 
the  potential  benefits  of  producing  probabilistic  forecasts.  In  order  to  reap  these 
benefits, the probabilities must represent true probabilities, which may not be
the case if the model is poorly constructed. One potential benefit of
probabilistic forecasts is that customers may compare the probabilities to the
risk profile of a given mission and adjust their decision making accordingly.
Also, such probabilistic output allows for a relatively straightforward
conversion to anomaly-type forecasts that may be a useful deliverable for many
decision makers.

E.  SUMMARY  OF  PREDICTION  METHOD 

Figure  18  is  a  schematic  of  the  process  involved  in  creating  and 
operationalizing  the  prediction  process  used  in  this  thesis.  This  process  is  a 
combined statistical-dynamical one, wherein one uses a numerical model to force
a statistical model to generate ensemble-based, probabilistic, intraseasonal
predictions of TC formations.
Figure 18. Depiction of the process for generating intraseasonal predictions of
tropical cyclogenesis.
III. RESULTS


A.  REGRESSION  MODEL 

The underlying goal in generating a regression model is to construct an
equation  for  the  probability  of  TC  formation  for  individual  day  grid  points  based 
on  the  values  of  corresponding  atmospheric  and  oceanic  variables.  Multivariate 
logistic regression was used to find optimal values of the intercept b_0 and the
coefficients b_k for each contributing variable x_k, such that the probability of
TC formation p_F at any given day grid point is:

$$p_F = \frac{e^{(b_0 + b_1 x_1 + \cdots + b_6 x_6)}}{1 + e^{(b_0 + b_1 x_1 + \cdots + b_6 x_6)}}$$

Table 1 below lists the variables and coefficients that are included in the
model we chose. The paragraphs that follow highlight some of the key details as
to how this model was constructed and why it was chosen from amongst the many
model permutations tested.

Table 1. Variable coefficients and related statistics, generated over a
June-November training period for the years 1982-2006.

     Variable                   Regression Coefficient   Significance   Standard     t Value
                                                         Rank           Error
-    (Intercept)                b0   -27.41179           -              1.81639      -15.09
x1   850 mb Rel. Vorticity      b1   167645.1            1              7074.82       23.69
x2   850 mb Rel. Vorticity^2    b2   -1679802094.0       2              112033900    -14.99
x3   SST                        b3   0.6567593           3              0.06061       10.83
x4   Vertical Wind Shear        b4   -0.05990173         4              0.00687       -8.71
x5   Coriolis Parameter         b5   15861.34            5              2646.58        5.99
x6   200 mb Divergence          b6   24729.49            6              6152.83        4.01

As  one  can  see,  the  variables  selected  for  inclusion  in  the  model  include: 
850  mb  relative  vorticity,  850  mb  relative  vorticity  squared,  SST,  vertical  wind 
shear, Coriolis, and 200 mb divergence. The magnitudes of the regression
coefficients are not indicative of the relative importance of each term, but
instead reflect the units of the variable. In addition, the units of each
coefficient are the inverse of the units of the associated variable; thus, the
linear combination of the variables and their coefficients is unitless. One may
also note that a term representing mid-level humidity, which appears in the
original listing of genesis parameters by Gray (1975), is not included in this
equation. The results from statistical testing indicated a significant degree of
multicollinearity between such a moisture variable and the other terms of the
equation. The exclusion of a moisture variable is not to say that moisture is
unimportant for the formation of TCs, but rather that the combination of the
other variables (cyclonic low-level circulation over warm ocean water, etc.)
acts as a suitable proxy for a moisture variable.
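To make the regression equation concrete, the following sketch evaluates it with the Table 1 coefficients at a single day grid point. The units are an assumption inferred from the coefficient magnitudes, since the text does not state them: vorticity, divergence, and the Coriolis parameter in s^-1, SST in degrees Celsius, and shear in m/s. The function name and the input values are hypothetical.

```python
import math

# Coefficients copied from Table 1 (June-November training, 1982-2006).
B0 = -27.41179
B_VORT = 167645.1
B_VORT_SQ = -1679802094.0
B_SST = 0.6567593
B_SHEAR = -0.05990173
B_CORIOLIS = 15861.34
B_DIV = 24729.49

OMEGA = 7.2921e-5  # Earth's rotation rate (rad/s)


def formation_probability(vort850, sst, shear, lat_deg, div200):
    """Daily TC formation probability at one day grid point.

    ASSUMED units (not stated in the source): vort850, div200 and the
    Coriolis parameter in s^-1, sst in deg C, shear in m/s.
    """
    coriolis = 2.0 * OMEGA * math.sin(math.radians(lat_deg))
    z = (B0
         + B_VORT * vort850
         + B_VORT_SQ * vort850 ** 2
         + B_SST * sst
         + B_SHEAR * shear
         + B_CORIOLIS * coriolis
         + B_DIV * div200)
    # Algebraically identical to exp(z) / (1 + exp(z)).
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical, loosely favorable conditions at 15N.
p_base = formation_probability(2e-5, 29.0, 5.0, 15.0, 5e-6)
p_warmer = formation_probability(2e-5, 30.0, 5.0, 15.0, 5e-6)
p_sheared = formation_probability(2e-5, 29.0, 15.0, 15.0, 5e-6)
```

Under these assumed units the daily probability is well below 5%, rises with warmer SST, and falls with stronger shear, consistent with the signs of the coefficients.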

Of  the  included  variables,  SST  is  the  only  one  directly  available  from  the 
CFS.  The  Coriolis  parameter  is  a  function  of  latitude,  and  thus  requires  no  model 
input.  The  remaining  variables  are  all  calculated  from  the  200  mb  and  850  mb 
zonal  and  meridional  winds,  which  are  available  from  the  CFS.  Despite  the  need 
for  these  calculations,  we  feel  that  these  variables  are  likely  more  predictable 
within  the  CFS  than  other  variables  that  are  more  dependent  upon 
parameterizations.  For  example,  a  variable  for  precipitation  rate  within  the  CFS 
would  be  highly  dependent  upon  the  convective  parameterization  scheme; 
whereas,  the  upper-level  component  winds  are  based  more  on  observational 
data  assimilated  directly  into  the  model  and  integrated  via  the  primitive 
equations. 

As noted earlier, a key factor in selecting a regression model is to
ensure physical plausibility. All the included variables have been shown to be
influential, or are known to be closely related to variables that have been shown
to be influential, in tropical cyclogenesis. In addition, the sign on each coefficient fits
the  conceptual  model  of  that  variable’s  relationship  to  TC  formation.  More 
specifically,  low-level  relative  vorticity,  Coriolis  parameter,  SST,  and  upper-level 
divergence  each  have  positive  coefficients,  and  an  increase  in  one  or  all  of  those 
variables  translates  into  a  more  favorable  environment  for  TC  formation.  The 
negative coefficient on the vertical wind shear term indicates that the lower the
vertical wind shear, the more favorable the environment for TC formation. The negative coefficient
on  the  squared  vorticity  term  is  plausible  as  well,  as  described  later  in  this 
section. 

The  significance  ranks  listed  in  Table  1  are  based  on  the  probability  that 
the  given  term  is  not  significant  to  the  performance  of  our  model  per  the  Chi- 
squared  test.  These  rankings  may  be  interpreted  as  indicating  that  850  mb 
relative  vorticity  has  the  lowest  probability  of  not  being  significant  to  our  model, 
and  thus  may  be  viewed  as  the  most  statistically  influential  component  of  the 
model.  All  of  the  terms  included  in  this  model  are  statistically  significant  to  the 
regression  model;  therefore,  even  though  the  200  mb  divergence  term  has  the 
lowest  significance  rank,  it  is  still  a  significant  contributor  to  the  model. 

An  issue  that  plagued  the  development  of  this  model  was  the  persistence 
of  storms  after  the  formation  day.  In  developing  the  model,  we  assigned  a  hit,  or 
occurrence  value  of  one,  to  the  day  grid  point  at  which  the  JTWC  best  track  data 
placed  the  formation  point  for  each  given  storm.  As  the  LSEFs  appeared  to  vary 
little  from  the  day  of  formation  to  the  days  immediately  surrounding  the  formation 
day,  the  regression  model  was  forced  to  discern  the  difference  in  the  LSEFs 
between  those  days,  in  essence  asking  why  was  one  day  a  hit  and  the  following 
day — with  nearly  identical  LSEFs — a  non-occurrence  point?  To  make  matters 
worse,  the  R2  data  used  in  the  training  of  the  models  often  depicts  the  storm 
tracks  well.  Therefore,  the  LSEFs  following  the  JTWC  formation  date  were  often 
more  favorable  than  on  the  formation  date.  This  is  especially  true  for  the 
dynamic  variables.  As  a  result,  we  needed  a  way  to  focus  the  model  in  on  the 
day  of  formation  and  introduce  a  variable  or  mechanism  to  the  model  to  identify 
when  a  well-developed  storm  is  being  depicted  by  the  assimilated  reanalysis 
fields. 

We adopted three modifications in the construction of this model to focus
on the formation day and to reduce the model-predicted probabilities associated
with storms that have already formed. First, we adopted a mean sea level
pressure (MSLP) filter. Before including a non-occurrence day grid block in the
regression, we filtered out blocks for which the MSLP was less than 990 mb.
Second,  we  reduced  the  NTCI  to  40%;  this  allowed  us  to  randomly  eliminate 
some  blocks  associated  with  storms  that  have  already  formed,  while  still  retaining 
784,660  non-occurrence  points  in  the  development  of  the  regression  model. 
Third,  we  added  the  squared  850  mb  relative  vorticity  term. 

The  squared  850  mb  relative  vorticity  term  forces  a  non-linear  response  to 
the  850  mb  relative  vorticity  in  the  generalized  linear  model.  Of  all  the  LSEFs, 
the low-level vorticity appears to change the most through the life of a TC. To
focus our model on the formation day, rather than on times when a storm is a
well-developed circulation center, we included this vorticity-squared term in
the regression model. With its negative coefficient, this term increasingly
reduces the probability of TC formation as the relative vorticity grows. In
essence, we are attempting to decrease the likelihood of formation in regions
where a TC already exists.
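A short calculation shows where the combined linear and squared vorticity terms stop increasing the favorability. This is a sketch only, using the Table 1 coefficients and assuming the vorticity is expressed in s^-1 (the units are not stated in the text); the variable names are hypothetical.

```python
B1 = 167645.1        # 850 mb relative vorticity coefficient (Table 1)
B2 = -1679802094.0   # squared 850 mb relative vorticity coefficient (Table 1)


def vorticity_contribution(zeta):
    """Contribution of both vorticity terms to the linear predictor."""
    return B1 * zeta + B2 * zeta ** 2


# Vertex of the downward-opening parabola: the most favorable vorticity.
zeta_peak = -B1 / (2.0 * B2)
# Vorticity at which the squared term fully cancels the linear term.
zeta_zero = -B1 / B2

peak_value = vorticity_contribution(zeta_peak)
```

Under the stated assumption, the contribution peaks near 5 x 10^-5 s^-1 and declines for larger values, which is the behavior intended to suppress probabilities where a well-developed circulation already exists.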

Other variables considered for inclusion, but not appearing in the final
form of the model, include 200 mb relative vorticity, thickness, MSLP,
precipitable water, and 850 mb divergence, among others. We also considered
combined variables, such as absolute vorticity in place of separate relative
vorticity and Coriolis parameter terms, and a combined upper-level minus
low-level divergence term.

The  final  form  of  the  model  outlined  in  Table  1  was  trained  only  on  the 
peak  formation  period,  June  through  November,  for  the  years  1982  through 
2006. To evaluate the stability of the model (see Section II.D.3.c), we developed
the  regression  equation  several  times,  each  time  excluding  one  year  from  the 
training  period.  Table  2  lists  the  coefficients  from  some  of  these  runs.  The 
variations  in  the  coefficients  are  minor;  therefore,  we  concluded  that  our  model  is 
stable  and  not  overfit.  Excluding  years  also  provided  us  with  years  of 
independent  data  (years  over  which  the  model  was  not  trained)  with  which  to 
conduct  additional  verification. 
Table 2. Comparison of regression coefficients for models with altered training
periods. The training period for the full model is all years during 1982-2006.

Variable           Full Model     Excluding 1982   Excluding 1997   Excluding 2001   Excluding 2006
(Intercept)        -27.41179      -27.51922        -27.44453        -27.40436        -27.17824
Rel. Vorticity     167645.1       164535.9         165662.7         166313           171080.2
Rel. Vorticity^2   -1679802094    -1641682993      -1638789695      -1668518817      -1770825439
SST                0.6567593      0.661396         0.6575657        0.6569342        0.6482437
Shear              -0.05990173    -0.06006378      -0.05951317      -0.05778334      -0.05813202
Coriolis           15861.34       16133.2          16637.09         15938.68         16204.37
Divergence         24729.49       26888.63         24222.44         23897.41         23509.69
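One simple way to quantify the coefficient variations in Table 2 is the largest relative departure of each leave-one-year-out coefficient from its full-model value. The sketch below uses the Table 2 values directly; the choice of this particular measure, and the names used, are illustrative assumptions rather than the test applied in the thesis.

```python
# (full-model value, [excluding 1982, 1997, 2001, 2006]) from Table 2.
TABLE2 = {
    "Intercept": (-27.41179, [-27.51922, -27.44453, -27.40436, -27.17824]),
    "Rel. Vorticity": (167645.1, [164535.9, 165662.7, 166313.0, 171080.2]),
    "Rel. Vorticity^2": (-1679802094.0,
                         [-1641682993.0, -1638789695.0,
                          -1668518817.0, -1770825439.0]),
    "SST": (0.6567593, [0.661396, 0.6575657, 0.6569342, 0.6482437]),
    "Shear": (-0.05990173,
              [-0.06006378, -0.05951317, -0.05778334, -0.05813202]),
    "Coriolis": (15861.34, [16133.2, 16637.09, 15938.68, 16204.37]),
    "Divergence": (24729.49, [26888.63, 24222.44, 23897.41, 23509.69]),
}


def max_relative_change(full, reduced):
    """Largest |reduced - full| / |full| across the reduced training runs."""
    return max(abs(c - full) / abs(full) for c in reduced)


spread = {name: max_relative_change(full, runs)
          for name, (full, runs) in TABLE2.items()}
least_stable = max(spread, key=spread.get)
```

By this measure every coefficient stays within 10% of its full-model value, with the 200 mb divergence term varying the most.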

This model was trained on data with daily temporal resolution, which
poses two potential challenges. First, because TCs are rare events (626
formations from among 785,286 day grid blocks in the training period), the daily
probability of TC formation is extremely low. This is true even for the most
favorable locations (i.e., the climatological position of the monsoon trough) and
times of year. For example, daily probabilities during the height of the peak
formation period in favorable regions seldom exceed 0.05, or 5%. Such low
probabilities, even if they are reliable, may be a challenge for forecasters and
operators to interpret. Second, at daily scales, the predictability at
intraseasonal leads of the variables included within this model tends to be low.
[Figure 19 panel title: Probability: R2 2001_264]

Figure 19. Example of contoured, seven-day summed probabilities, centered
about the 264th day (21 September) of 2001, constructed from R2 and
OISST fields using the model described above. The red dot indicates the
verification point for a TC that formed on 21 September 2001.


In  order  to  address  these  problems  concerning  daily  probabilities,  we 
investigated  non-native  versions  of  the  probabilistic  output.  The  version  upon 
which  we  settled  was  a  summed  seven-day  probability.  When  using  TC 
formations  to  verify  this  model  output,  we  compared  the  TC  formation  date  and 
location  to  the  sum  of  the  output  probabilities  for  the  seven  days  centered  on  the 
formation  day:  the  three  days  prior  to  formation,  the  day  of  formation,  and  the 
three  days  following  formation.  Figure  19  is  an  example  of  seven-day  summed 
probabilities  from  a  hindcast  valid  21  September  2001;  the  days  summed  to 
create  this  plot  are  18  September  through  24  September.  The  subsequent  plot 
(not  shown)  would  be  valid  on  22  September  2001,  and  be  the  summation  of  19 
September  to  25  September  daily  probabilities.  The  reasons  for  favoring  this 
seven-day  summation  were  threefold.  First,  the  probabilities  of  formation  at  daily 
time  steps  are  small  due  to  the  rarity  of  TC  formation,  so  the  summation 
increases  the  probabilities  in  active  grid  blocks  to  values  that  may  be  used  in 
decision-making  by  users.  Second,  the  daily  output  of  summed  seven-day 
probabilities  should  enhance  the  predictability  within  the  model,  as  it  reduces  the 
potential  impacts  of  timing  error  within  the  forecast  fields,  and  provides  a  better  fit 
with  the  time  averaging  approach  that  tends  to  enhance  the  skill  of  long  lead 
forecasts. Third, this approach enhances the usefulness of the output as a
planning product; for example, if operators are planning, at intraseasonal lead
times, multi-day transits of the WNP, a multi-day probability forecast may be a
better match to the planning process. The probabilities shown in the remainder of
this thesis are seven-day summed probabilities, unless otherwise specified.
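The seven-day summation described above amounts to a centered running sum along the time axis. A minimal sketch follows, assuming the daily probabilities are stored in an array whose first axis is the day; the function name and the choice to mark incomplete windows as missing are illustrative assumptions.

```python
import numpy as np


def seven_day_sum(daily_prob, half_window=3):
    """Centered running sum along axis 0 (time).

    Days without a full window (the first and last `half_window` days)
    are set to NaN.
    """
    daily_prob = np.asarray(daily_prob, dtype=float)
    n_days = daily_prob.shape[0]
    width = 2 * half_window + 1
    out = np.full(daily_prob.shape, np.nan)
    # Prepend a zero slice so that window sums are cumulative-sum differences.
    zero = np.zeros((1,) + daily_prob.shape[1:])
    csum = np.concatenate([zero, np.cumsum(daily_prob, axis=0)], axis=0)
    if n_days >= width:
        out[half_window:n_days - half_window] = csum[width:] - csum[:-width]
    return out


# Toy example: 10 days on a 2 x 3 grid with a constant daily value of 0.01.
daily = np.full((10, 2, 3), 0.01)
summed = seven_day_sum(daily)

# One-dimensional check: days 0..9 with values 0..9.
ramp = seven_day_sum(np.arange(10.0))
```

For the hindcast valid 21 September 2001, the window would span 18 September through 24 September, matching the example given in the text.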

B.  VERIFICATION  OF  THE  REGRESSION  MODEL 

As  depicted  in  Figure  1  of  this  thesis,  Verification/Evaluation  is  a  vital  step 
in  the  climate  prediction  process.  Such  verification  and  evaluation  is  required  for 
two  primary  reasons:  to  identify  potential  shortfalls  or  weaknesses  that  may  be 
corrected by redoing the model development stage, and to ascertain the
potential  skill  and  value  the  method  offers  potential  users. 

In  our  verification  of  this  regression  model,  we  faced  two  complicating 
factors.  First,  we  are  actually  predicting  the  favorability  of  the  large-scale 
environment to support TC formations, not the formations themselves. This
shortcoming returns to the idea that the LSEFs used in the model are necessary,
but may not be sufficient, as noted in Section II.C. This complication arises when
one uses actual TC occurrences to verify what are essentially forecasts of the
propensity for TC formation based on environmental factors. The second
complicating factor is that few techniques exist to verify spatially distributed
predictions of events that are as rare as TC formations.

Other  organizations  that  are  delving  into  the  realm  of  intraseasonal  climate 
prediction  appear  to  be  struggling  with  verification  as  well.  With  no  standard 
approach  as  to  how  to  verify  such  predictions,  we  feel  the  best  approach  to 
verification  is  to  use  several  methods  in  concert. 

1.  Quantitative  Verification 

The  first  class  of  verification  we  explored  was  quantitative  verification.  For 
the  sake  of  brevity,  we  note  only  the  key  points  for  each  quantitative  verification 
technique.  The  reader  is  directed  to  references,  such  as  Wilks  (2006)  or  Eckel 
(2008), for additional details on the construction and interpretation of these
verification techniques. Paramount in quantitative verification is having sufficient
forecasts to verify. In order to encompass a sufficient number of storms, the
verification in this section is for multi-year zero-lead hindcasts over dependent
data.  The  period  of  verification,  unless  otherwise  specified,  is  the  June  through 
November  peak  formation  period,  as  this  matches  the  period  over  which  the 
model  was  trained  and  reduces  the  potential  for  data  dilution  from  the  months 
when  few  storms  develop.  Over  this  period  for  the  years  1982  to  2006,  626 
storms  were  identified  by  the  JTWC  in  the  region  we  define  as  the  WNP,  versus 
752  storms  if  the  verification  period  is  expanded  to  encompass  every  day  of  the 
year for the same years. Thus, relatively few TCs (126 of 752, or about 17%) were
left out of the verification process when we limited ourselves to verifying using
just June-November TCs.

Many  of  the  quantitative  verification  techniques  that  follow  are  based  on 
dichotomous  observation  values;  a  value  of  one  if  the  event  is  observed,  or  zero 
if  the  event  is  not  observed.  For  our  verifications,  we  opted  to  credit  an  observed 
value  to  any  grid  point  that  fell  at  or  within  a  2.5°  radius  of  the  JTWC  formation 
point.  We  feel  a  2.5°  radius  about  the  formation  point  accounts  for  the  spatial 
influence  of  a  forming  TC,  as  well  as  accounts  for  some  of  the  uncertainty  in  the 
formation  location  in  the  JTWC  best  track  data. 
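The crediting rule above can be sketched as a mask over the grid. The grid bounds used here (0-30N, 100-190E at 2.5 degree spacing, giving the 13 x 37 grid mentioned earlier) and the use of a simple distance in degrees of latitude and longitude are assumptions made for illustration; the function name is hypothetical.

```python
import numpy as np


def observed_mask(grid_lats, grid_lons, form_lat, form_lon, radius_deg=2.5):
    """Return a 0/1 array marking grid points at or within the radius."""
    lat2d, lon2d = np.meshgrid(grid_lats, grid_lons, indexing="ij")
    dist = np.hypot(lat2d - form_lat, lon2d - form_lon)
    return (dist <= radius_deg).astype(int)


lats = np.arange(0.0, 32.5, 2.5)      # 13 rows (ASSUMED bounds: 0N to 30N)
lons = np.arange(100.0, 192.5, 2.5)   # 37 columns (ASSUMED bounds: 100E to 190E)

# Formation exactly on a grid point: that point plus its four neighbors.
mask_on_grid = observed_mask(lats, lons, form_lat=15.0, form_lon=140.0)
# Formation between grid points: the four surrounding points are credited.
mask_off_grid = observed_mask(lats, lons, form_lat=16.0, form_lon=141.0)
```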

To  provide  us  with  a  standardized  measure  of  performance  based  on  our 
predictions  of  the  probability  of  formation,  we  used  the  Brier  skill  score  (BSS). 
Over the peak formation season, our model results in a BSS of 0.029055
(0.028211...0.02994). The ranges included in the parentheses represent a 95%
confidence interval, generated through jackknifing each of the years in the
training period. Recall that positive values of the BSS represent improvement
over the sample climatology baseline; thus, our model shows notable skill. When
verifying the model over the full year, the BSS increases to 0.032555
(0.031927...0.033182). Eckel (2008) notes that BSS is vulnerable to dataset
dilution,  which  likely  accounts  for  this  increase  in  the  skill  score  when  verifying 
over  the  entire  year. 
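For reference, the Brier score and the Brier skill score against a sample climatology baseline can be computed as in the sketch below. The function names and the small sample of forecasts and observations are hypothetical and are not the hindcast data.

```python
import numpy as np


def brier_score(prob, obs):
    """Mean squared difference between forecast probability and outcome."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    return float(np.mean((prob - obs) ** 2))


def brier_skill_score(prob, obs):
    """BSS relative to a constant forecast of the sample event frequency."""
    obs = np.asarray(obs, dtype=float)
    bs = brier_score(prob, obs)
    bs_ref = brier_score(np.full(obs.shape, obs.mean()), obs)
    return 1.0 - bs / bs_ref


# Hypothetical sample: 10 day grid points, 2 of them observed events.
obs = np.array([0, 0, 0, 0, 0, 0, 0, 1, 0, 1])
prob = np.array([0.05, 0.05, 0.10, 0.05, 0.10, 0.05, 0.20, 0.60, 0.10, 0.70])

bss = brier_skill_score(prob, obs)
bss_climatology = brier_skill_score(np.full(obs.shape, obs.mean()), obs)
bss_perfect = brier_skill_score(obs, obs)
```

A forecast equal to the sample climatology scores zero and a perfect forecast scores one, so any positive value indicates improvement over the baseline.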
Figure 20. Reliability diagram (left) and bin histogram (right) generated with
minimum bin intervals of 0.005 for the zero-lead hindcasts, from the model
outlined in Table 1 over the June-November period for 1982 to 2006;
error bars represent a 95% confidence interval.


Figure 21. The same reliability diagram as in Figure 20, but focused on the
lower probabilities; error bars represent a 95% confidence interval.


Figures  20  and  21  depict  the  reliability  diagrams  for  zero-lead  hindcasts 
with  the  model  outlined  in  Table  1,  for  the  peak  formation  seasons  of  the  years 
1982  through  2006.  The  reliability  diagram  for  a  perfectly  reliable  model  would 
lie  along  the  diagonal  indicated  by  the  dashed  line.  Points  within  the  region 
defined  by  the  solid  lines  indicate  positive  skill.  The  figures  show  graphically 
what  we  learned  from  the  BSS,  that  this  model  exhibits  skill  over  the  sample 
climatology  baseline.  The  line  connecting  the  results  points  is  above  the 
diagonal,  indicating  that  the  model  slightly  underforecasted  TC  formations.  The 
sporadic  behavior — as  captured  by  the  error  bars — in  the  “higher”  probabilities  is 
likely  due  to  the  drop  in  number  of  points  in  those  bins.  From  these  reliability 
diagrams, we obtain an approximate BSS of 0.02852, reliability of 0.000065693,
resolution of 0.00032856, and uncertainty of 0.0092171.
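The approximate BSS quoted above follows from the standard decomposition of the Brier score into reliability, resolution, and uncertainty (see Wilks 2006). The short check below simply repeats that arithmetic with the three reported values; the variable names are illustrative.

```python
# Values reported from the reliability diagrams.
reliability = 0.000065693
resolution = 0.00032856
uncertainty = 0.0092171

# Brier score decomposition: BS = reliability - resolution + uncertainty.
brier_score = reliability - resolution + uncertainty

# Skill relative to sample climatology, whose Brier score is the uncertainty.
bss = (resolution - reliability) / uncertainty
```

The result, about 0.0285, agrees with the approximate BSS stated in the text.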


Figure 22. ROC diagram for the zero-lead hindcasts over the peak formation
season for the years 1982 to 2006.

In  addition  to  having  skill,  a  worthwhile  predictive  method  must  also  offer 
utility  and  value  to  the  user.  The  relative  operating  characteristic  (ROC)  diagram 
and economic value diagram (EVD) are two graphical tools that one may use to
ascertain whether a method may offer such value and utility. Figure 22 shows
the  ROC  diagram  for  the  zero-lead  hindcasts  over  the  aforementioned 
verification  period.  A  diagonal  line  (not  shown)  connecting  (0,0)  with  (1,1)  would 
represent  zero  resolution  or  no  discrimination.  Forecasts  with  better 
discrimination  have  ROC  curves  approaching  the  upper-left  corner  of  the 
diagram  (Wilks  2006).  As  a  result,  one  can  see  that  the  model  exhibits  fair 
discrimination  and  offers  potential  utility  to  the  user.  Along  with  the  ROC 
diagram,  one  may  calculate  a  ROC  skill  score  (ROCSS),  which  has  a  value  of 
one  for  a  perfect  forecast  and  is  less  than  zero  if  the  forecast  is  worse  than  the 
sample  climatology  forecast.  The  ROCSS  for  these  hindcasts  is  0.68325. 
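A minimal sketch of how a ROC curve and skill score may be computed from probabilistic forecasts follows. It assumes the common definition ROCSS = 2A - 1, where A is the area under the ROC curve; the text does not state which formula was used, so this is an assumption, as are the function names and the sample data.

```python
import numpy as np


def roc_points(prob, obs, thresholds):
    """False alarm rate and hit rate for each warning threshold."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs).astype(bool)
    hit, false_alarm = [], []
    for t in thresholds:
        warn = prob >= t
        hit.append(np.sum(warn & obs) / np.sum(obs))
        false_alarm.append(np.sum(warn & ~obs) / np.sum(~obs))
    return np.array(false_alarm), np.array(hit)


def roc_skill_score(prob, obs, thresholds):
    """ROCSS = 2 * (area under the ROC curve) - 1 (ASSUMED definition)."""
    f, h = roc_points(prob, obs, thresholds)
    # Anchor the curve at (0, 0) and (1, 1), then order the points.
    f = np.concatenate(([0.0], f, [1.0]))
    h = np.concatenate(([0.0], h, [1.0]))
    order = np.lexsort((h, f))  # sort by false alarm rate, then hit rate
    f, h = f[order], h[order]
    # Trapezoidal area under the ordered points.
    area = float(np.sum(np.diff(f) * (h[1:] + h[:-1]) / 2.0))
    return 2.0 * area - 1.0


thresholds = np.linspace(0.0, 1.0, 11)
obs = np.array([0, 0, 0, 1, 1])
rocss_perfect = roc_skill_score([0.1, 0.2, 0.3, 0.8, 0.9], obs, thresholds)
rocss_no_skill = roc_skill_score([0.5, 0.5, 0.5, 0.5, 0.5], obs, thresholds)

# Under the assumed definition, the reported ROCSS implies this ROC area.
area_implied = (0.68325 + 1.0) / 2.0
```

If the thesis used the same definition, the reported ROCSS of 0.68325 would correspond to an area under the ROC curve of roughly 0.84.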


Figure 23. EVD for the zero-lead hindcasts over the peak formation season for
the years 1982 to 2006.

An EVD, as shown in Figure 23, plots value score versus cost/loss (C/L)
ratio, and is a representation of the potential value added by following the
forecast guidance for each customer (as defined by their C/L ratio). While initially
one may not be impressed by the EVD in Figure 23 due to its skew, this EVD
actually depicts significant potential value for risk-averse customers. Whether a
customer defines their C/L ratio in terms of dollars, sortie hours, or crew morale,
most customers would be risk averse (low C/L ratio) to a hit by a TC. A
hypothetical example may be in order. Let us imagine a cruiser is steaming
towards Subic Bay and the forecast calls for a TC. The captain can either divert
around the storm at an additional cost of $100,000 above and beyond typical
operating costs, or maintain course and, if the cruiser is hit, suffer damages
worth $1M in equipment and lost time. With these numbers, this customer would
have a C/L ratio of $100,000/$1,000,000, or 0.1, and thus should be highly risk
averse. For such a customer, the EVD indicates that the model has the potential
to be very valuable in mission planning. While this example is grossly
oversimplified, it reveals the basic idea associated with the EVD and, thus, the
potential benefits of this model.
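The idea behind a value score can be sketched with the standard static cost/loss model, in which a user either pays a cost C to protect or risks a loss L. This is a generic illustration rather than the exact calculation behind Figure 23; the hit rate, false alarm rate, and event base rate used below are hypothetical values, and only the C/L ratio of 0.1 comes from the cruiser example.

```python
def relative_value(hit_rate, false_alarm_rate, base_rate, cost_loss):
    """Value score in the static cost/loss model (expenses in units of L).

    1 corresponds to a perfect forecast; 0 corresponds to the better of
    always protecting or never protecting (the climatological strategy).
    """
    expense_climatology = min(cost_loss, base_rate)
    expense_perfect = base_rate * cost_loss
    # Pay the cost on every warning; suffer the loss on every missed event.
    expense_forecast = (
        cost_loss * (hit_rate * base_rate
                     + false_alarm_rate * (1.0 - base_rate))
        + base_rate * (1.0 - hit_rate)
    )
    return ((expense_climatology - expense_forecast)
            / (expense_climatology - expense_perfect))


# C/L ratio from the cruiser example: $100,000 to divert, $1M if hit.
cost_loss = 100_000 / 1_000_000

# HYPOTHETICAL forecast quality and event frequency, for illustration only.
value = relative_value(hit_rate=0.8, false_alarm_rate=0.1,
                       base_rate=0.05, cost_loss=cost_loss)
value_perfect = relative_value(1.0, 0.0, base_rate=0.05, cost_loss=cost_loss)
value_never_protect = relative_value(0.0, 0.0, base_rate=0.05,
                                     cost_loss=cost_loss)
```

Repeating the calculation over a range of C/L ratios, and taking the best warning threshold at each ratio, would trace out a curve of the kind plotted in an EVD.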

2.  Qualitative  Verification 

While  qualitative  verification  is  often  not  as  definitive  as  quantitative 
verification,  it  does  offer  the  advantage  of  allowing  us  to  verify  using  purely 
independent data. Options for independent data include using runs generated
with a year left out of model development, then verifying over that excluded
year, or using another variable source (such as R1 data for atmospheric
variables).


Figure 24. Example of contoured, seven-day summed probabilities, centered
about the 236th day (24 August) of 2001, constructed from R2 and OISST
fields. The red dot indicates the verification point for a TC that formed on
24 August 2001.

Figure 24 is a contour plot of probabilities for the period of 21-27 August
2001. The model used in generating this plot was trained over a period that
excluded the year 2001; therefore, this plot was generated with independent
data. Plots such as that in Figure 24 indicate that the methodology proposed in
this thesis may prove beneficial: this zero-lead hindcast shows "high"
probabilities in a pattern resembling that expected from reverse monsoon trough
conditions, which differs markedly from the pattern expected from typical
climatological monsoon trough conditions in August.


Figure  25.  Example  of  contoured,  seven-day  summed  probabilities,  centered 
about  the  309th  day  (5  November)  of  2001,  constructed  from  R2  and 
OISST  fields.  The  red  dot  indicates  the  verification  point  for  a  TC  that 
formed on 5 November 2001.


Figure 25 shows a zero-lead hindcast in which the pattern of model
probabilities resembles the pattern that might be expected from climatological
monsoon trough conditions. Even when the model probability patterns resemble
climatology, the model may add value by providing a more accurate prediction of
the magnitude of the probabilities, as discussed in the next section.
3. Comparison to Climatology

The  preceding  plots  provide  hope  as  to  the  potential  usefulness  of  our 
proposed  method  for  predicting  TC  formations.  One  may  wonder  how  this 
method  compares  to  climatology,  but  with  climatology  comes  the  question  of 
what  form  of  climatology  is  the  best  against  which  to  compare  our  method.  See 
Appendix  A  for  a  brief  discussion  on  the  various  forms  of  climatology  one  may 
select. 

The  idea  of  hits  and  misses  is  commonplace  in  verification,  and  one  we 
shall  use  here.  A  simple  subtraction  of  the  climatological  formation  probability 
from  the  hindcast  probability  at  every  day  grid  block  yields  a  difference  matrix. 
Using  the  JTWC  best  track  formation  points,  a  hit  (miss)  is  defined  as  occurring 
when  the  difference  at  the  day  grid  block  of  formation  is  positive  (negative). 
Scoring  over  the  years  1982  through  2006,  our  model  had  681  hits  and  81 
misses,  for  a  hit  rate  of  89%. 
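The hit and miss scoring above can be sketched as follows, assuming the hindcast and climatological probabilities are stored as arrays indexed by (day, latitude, longitude). The function name and the toy arrays are hypothetical; only the final counts of 681 hits and 81 misses come from the text.

```python
import numpy as np


def score_hits(hindcast, climatology, formation_indices):
    """Count formations where the hindcast exceeds (falls below) climatology."""
    diff = np.asarray(hindcast) - np.asarray(climatology)  # difference matrix
    at_formation = np.array([diff[idx] for idx in formation_indices])
    hits = int(np.sum(at_formation > 0))
    misses = int(np.sum(at_formation < 0))
    return hits, misses


# Toy example: 3 days on a 2 x 2 grid with a flat climatology of 0.02.
climatology = np.full((3, 2, 2), 0.02)
hindcast = np.full((3, 2, 2), 0.02)
hindcast[0, 0, 0] = 0.05   # above climatology at the first formation point
hindcast[1, 1, 1] = 0.01   # below climatology at the second
hindcast[2, 0, 1] = 0.04   # above climatology at the third
formations = [(0, 0, 0), (1, 1, 1), (2, 0, 1)]
toy_hits, toy_misses = score_hits(hindcast, climatology, formations)

# Hit rate implied by the counts reported in the text.
hit_rate = 681 / (681 + 81)
```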


[Figure 26 panel title: Probability Difference: 2001_236]

Figure 26. Plot of the difference matrix resulting from subtracting climatological
probabilities from hindcast probabilities centered on 24 August 2001, the
same day used in Figure 24. Green dots denote the formation points for
the four storms that formed within the seven-day period of 21-27 August
2001.


One  may  also  plot  this  difference  matrix;  Figure  26  is  an  example  of  such 
a  plot.  Warm  (cool)  colors  represent  regions  where  the  probabilities  from  the 
model  are  higher  (lower)  than  the  climatological  probabilities.  This  approach  is 
akin  to  an  anomaly  forecast,  where  the  positive  regions  may  be  interpreted  as 
having a greater than normal likelihood of TC formation. The period of 21-27
August 2001, which is depicted in Figures 24 and 26, was unusually active along
20°N.  This  difference  product  highlights  this  activity,  but  it  also  indicates  that  the 
probability  of  formation  along  the  climatological  position  of  the  monsoon  trough  is 
lower  than  climatology  suggests.  In  some  cases,  knowing  that  formation  is  less 
likely  in  a  region  when  compared  to  climatology  may  be  just  as  beneficial  as 
knowing  that  formation  is  more  likely  in  some  other  region. 

4.  Climate  Oscillations 

Section II.C.4 briefly introduced the impacts of ENLN on TC formation. If
our  model  accurately  depicts  the  favorability  of  the  large-scale  environment,  then 
it  should  depict  a  shift  in  the  probabilities  associated  with  the  changes  in  the 
large  scale  environment  that  are  associated  with  ENLN. 


El Nino JASO Composite Probabilities

La Nina JASO Composite Probabilities


Figure  27.  Average  daily  probabilities  for  the  JASO  period  from  composited  El 
Nino  years  (top)  and  La  Nina  years  (bottom). 

Defining ENLN years based on the Oceanic Nino Index (ONI), we can take
1982, 1987, 1991, and 1997 as classic (non-Modoki) El Nino years and 1985,
1988, 1999, and 2000 as La Nina years. Averaging the daily probabilities from
the zero-lead hindcasts over the July, August, September, and October (JASO)
period yields the probability patterns shown in Figure 27. Note the shift in the
highest probability regions between the two plots; these shifts are similar to those
described in prior studies of the impacts of ENLN on TC formations (e.g., Ford
2000). For example, the high probabilities that extend farther to the east during
the  El  Nino  years  are  representative  of  the  eastward  shift  of  the  regions  of  warm 
water,  low-level  cyclonic  flow/convergence,  and  low  vertical  wind  shear  from  their 
climatological  positions.  In  contrast,  slightly  higher  probabilities  near  the 
Maritime  Continent  in  the  bottom  panel  of  Figure  27  are  due  to  the  westward  shift 
of  favorable  LSEFs  during  La  Nina  years. 
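
The compositing step behind Figure 27 amounts to averaging daily probability maps over every JASO day of the selected years. The sketch below shows that averaging on toy data; the dictionary layout, the function name `jaso_composite`, and the probability values are assumptions made for illustration, while the year lists are those given in the text.

```python
import numpy as np

EL_NINO_YEARS = (1982, 1987, 1991, 1997)
LA_NINA_YEARS = (1985, 1988, 1999, 2000)
JASO_DAYS = 31 + 31 + 30 + 31  # July through October

def jaso_composite(daily_probs_by_year, years):
    """Mean daily probability map over every JASO day of the given years.

    daily_probs_by_year: dict mapping year -> array (day, lat, lon) that
        holds only the JASO days of that year (assumed layout).
    """
    stacked = np.concatenate([daily_probs_by_year[y] for y in years], axis=0)
    return stacked.mean(axis=0)

# Toy hindcasts on a 2 x 3 grid; the values are illustrative only.
toy = {y: np.full((JASO_DAYS, 2, 3), 0.002)
       for y in EL_NINO_YEARS + LA_NINA_YEARS}
toy[1997] = np.full((JASO_DAYS, 2, 3), 0.004)

el_nino_map = jaso_composite(toy, EL_NINO_YEARS)
la_nina_map = jaso_composite(toy, LA_NINA_YEARS)
shift = el_nino_map - la_nina_map  # positive where El Nino years are more favorable
```

Differencing the two composites, as in the last line, is one simple way to make the east-west shift explicit.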

5.  Conditional  Climatologies 

Another  potential  use  for  our  model  that  emerged  during  this  research  was 
the  possibility  of  creating  conditional  climatologies  in  the  manner  of  constructed 
analogues.  The  underlying  idea  is  that  rather  than  generating  a  climatology  plot 
based on the raw formation data, we could generate a plot based on
model-generated probabilities. This approach could be as basic as generating an
annual  climatology  based  on  LTM  conditions,  or  as  complex  as  conditioning 
based  on  time  of  year,  ENLN,  et  cetera. 


JASO  LTM  Daily  Probabilities 


Figure  28.  Probabilities  from  LTM  JASO  R1  and  OISST  variables.  The  red 
dots  indicate  the  formation  points  for  all  JASO  TCs  from  1971  -  2000. 

Figure 28 depicts contours of daily probabilities for the JASO period,
based  on  LTM  R1  and  LTM  OISST  LSEF  values  composited  over  the  years  1971 
-2000  and  1982  -  2000,  respectively.  The  period  of  1971  -  2000  is  used  for  this 
and  other  long-term  mean  conditions,  as  it  represents  the  current  World 
Meteorological  Organization  (WMO)  standard,  30-year  climatology  period.  This 
plot  is  not  a  perfect  representation  of  the  raw  climatology;  for  example,  the 
formation  points  clustered  around  25°N  and  165°E  are  not  captured  well  by  the 
contours  despite  the  density  of  storms  in  that  location. 


Constructed  7-Day  Probs  for  June  Entering  Classic  El  Nino 

Figure 29. Contoured, seven-day probabilities, constructed from R2 and
OISST  fields.  The  red  dots  indicate  the  verification  point  for  TCs  that 
formed  during  such  conditions. 

The  concept  of  a  constructed  analogue  is  combining  past  anomaly 
patterns  such  that  the  resulting  combination  reflects  the  desired  state  of  the 
climate  (van  den  Dool  2007).  As  an  example,  we  constructed  a  probability  plot 
for  the  month  of  June  when  the  climate  system  is  entering  into  an  El  Nino 
pattern.  Using  the  ONI,  such  conditions  were  met  during  the  years  1991,  1997, 
and 2002. Averaging the probabilities from our model for these three months
(one month each for three years) and dividing to give seven-day probabilities
yields the pattern depicted in Figure 29. In essence, the result is an improved
representation  of  expected  probabilities  for  a  week  in  the  month  of  June  when  an 
El  Nino  event  is  developing. 

Though this method is not without limitations, it has a notable
advantage in that it does not require dynamical input. As a result,
this  constructed  analogue  approach  may  be  useful  for  providing  tropical  activity 
outlooks  at  extended  lead  times. 

6.  Verification  Against  Deep  Convection 

As  noted  in  the  beginning  of  the  section  on  verification,  many  of  the 
verification  methods  we  have  discussed  thus  far  are  problematic  because  they 
verify  against  observed  TC  formations,  even  though  the  model  predicts  the 
propensity  for  formation,  not  actual  formations.  Thus,  we  chose  to  also  verify 
against  outgoing  longwave  radiation  (OLR),  since  low  OLR  values  indicate  deep 
convection  and  thus  a  large-scale  environment  that  is  likely  to  be  favorable  for 
TC  formation. 


Probability  R2  2006_304 




Figure  30.  Comparison  of  zero-lead  hindcast  probabilities  (top)  and  OLR 
(bottom).  OLR  image  provided  by  Physical  Sciences  Division,  Earth 
System  Research  Laboratory,  NOAA,  Boulder,  Colorado,  from  their  Web 
site  at  http://www.esrl.noaa.gov/psd/. 

Figure 30 is presented as an example of model-derived probabilities for 28
October  through  3  November  2006  and  the  corresponding  NOAA  Interpolated 
OLR.  Note  in  the  tropics  the  general  correspondence  between  the  higher 
probabilities  and  the  low  OLR  values  (cool  colors)  that  correspond  to  cold  high 
cloud  tops  and  deep  convection.  This  sort  of  correspondence  indicates  that  the 
model  is  capable  of  identifying  deep  convective  regions  that  are  favorable  for  TC 
formation,  and  has  the  potential  to  be  useful  in  intraseasonal  predictions  of 
tropical  convective  activity.  To  operationalize  such  an  approach  for  predicting 
convective  activity,  the  Coriolis  term  should  be  removed  from  the  regression 
model. 


7.  Verification  in  Other  Basins 

This  final  form  of  verification  is  one  that  tests  whether  the  model  truly 
represents  a  physically  sound  combination  of  LSEFs.  Earlier  authors  presented 
their genesis parameters as relevant to TC formations in all tropical ocean
basins. As a result, one is left to wonder how the model, as described in Table 1,
would  perform  on  fully  independent  data  in  basins  other  than  the  WNP.  Figure 
31  is  an  example  of  a  probability  plot  that  results  when  the  Pacific-trained  model 
is  used  to  generate  probabilities  for  the  North  Atlantic  basin. 

7-Day Cumulative Probabilities: 2006 Pacific Coefficients 233

Figure 31. Example of contoured, seven-day summed probabilities over the
Atlantic basin, centered about the 233rd day (21 August) of 2006,
constructed from R2 and OISST fields using the model trained on the
WNP. The black dot indicates the verification point for a TC that formed
on 21 August 2006.


Quantitative  verification  of  the  storms  that  developed  into  tropical  storms 
or  hurricanes  in  the  Atlantic  during  the  months  of  June  through  November  and 
years 1982-2006 yields promising results. Over that period, hits number 273 and
misses 16 (a hit rate of 94%), with a BSS of 0.019476 (0.018584...0.020306) and
a ROCSS of 0.58959. These positive results support what was proposed by authors
such as Gray and Frank, that the same set of LSEFs influences TC formation
regardless of the ocean basin.
Though  the  model  is  likely  better  tuned  if  trained  over  the  basin  over  which  it  will 
be  used  as  a  predictive  tool,  this  comparison  suggests  that  one  basic  model  may 
be  skillfully  applied  to  multiple  basins. 
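
For readers less familiar with the two scores quoted above, the following sketch shows the standard textbook forms of the Brier skill score (BSS) and the relative operating characteristic skill score (ROCSS). It is an illustration only: the reference forecast used here (the sample event frequency), the function names, and the toy data are assumptions, and the thesis's own implementation may differ in its choice of reference and in how the ROC area is estimated.

```python
import numpy as np

def brier_skill_score(prob, ref_prob, obs):
    """BSS = 1 - BS / BS_ref; positive values beat the reference forecast."""
    prob, ref_prob, obs = (np.asarray(a, dtype=float)
                           for a in (prob, ref_prob, obs))
    bs = np.mean((prob - obs) ** 2)
    bs_ref = np.mean((ref_prob - obs) ** 2)
    return 1.0 - bs / bs_ref

def roc_skill_score(prob, obs):
    """ROCSS = 2 * AUC - 1, with AUC from pairwise event/non-event comparisons.

    The pairwise comparison is fine for small samples; a full day-by-grid
    hindcast set would need a rank-based or binned estimate instead.
    """
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs).astype(bool)
    events, non_events = prob[obs], prob[~obs]
    greater = np.sum(events[:, None] > non_events[None, :])
    ties = np.sum(events[:, None] == non_events[None, :])
    auc = (greater + 0.5 * ties) / (events.size * non_events.size)
    return 2.0 * auc - 1.0

# Toy sample: 1 marks a day grid block with an observed formation.
obs = np.array([0, 0, 1, 0, 1, 0])
prob = np.array([0.1, 0.2, 0.7, 0.3, 0.4, 0.1])
reference = np.full(obs.shape, obs.mean())  # assumed climatological reference

bss = brier_skill_score(prob, reference, obs)
rocss = roc_skill_score(prob, obs)
no_skill = roc_skill_score(np.full(obs.shape, 0.2), obs)
```

Because TC formation is a rare event on a daily 2.5° grid, a small positive BSS alongside a much larger ROCSS, as reported here, is not contradictory: the BSS penalizes probability magnitude errors, while the ROCSS measures only discrimination.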

8.  Model  Shortcomings

Two  potential  shortcomings  were  identified  in  the  verification  of  the  zero- 
lead  hindcasts.  Both  of  these  shortcomings  deal  with  the  post-formation 
environment.  First,  the  conditions  that  follow  the  formation  day  are  likely  to  be 
represented  by  the  model  as  remaining  favorable  for  formation,  despite  a  TC 
having  already  formed.  The  impacts  of  this  shortcoming  may  be  minimized  by 
noting  that  the  probabilities  represent  the  favorability  of  the  large-scale 
environment  for  TC  formation,  and  if  a  TC  forms  the  high  probabilities  may 
represent  the  likely  track  of  the  storm.  Second,  TCs  may  act  to  enhance  or 
suppress  the  formation  of  other  tropical  cyclones  (Frank  1982).  Due  to  the 
coarse  resolution  of  the  CFS,  it  may  poorly  represent  the  TC-environment 
feedback.  Further  study  would  be  required  to  assess  the  impacts  of  this  second 
shortcoming,  though  such  research  ventures  beyond  the  scope  of  this  thesis. 

C.  VARIATIONS  OF  THE  REGRESSION  MODEL 

The  previous  verification  sections  have  tested  a  model  containing  terms 
for  850  mb  relative  vorticity,  850  mb  relative  vorticity  squared,  SST,  vertical  wind 
shear,  Coriolis  parameter,  and  200  mb  divergence,  and  trained  over  the  peak 
formation  period  for  the  years  1982-2006.  Through  the  course  of  this  thesis 
research,  numerous  forms  of  the  model  were  tested,  in  addition  to  this  final 
model.  For  example,  we  varied  the  training  period  of  the  model,  such  as  training 
the  model  over  the  entire  year  and  over  JASO,  rather  than  just  over  the  peak 
formation  period.  We  also  investigated  the  inclusion  and/or  combination  of  other 
variables as noted earlier in Section III.A. Using the suite of metrics and
verification  techniques  listed  in  Chapter  II,  we  selected  the  final  model  from 
among  the  many  tested.  For  the  sake  of  brevity,  only  verification  for  the  final 
model  has  been  presented  in  this  thesis. 
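
To make the structure of the final model concrete, the sketch below evaluates a logistic regression with the six terms listed above. The coefficient values are placeholders chosen only so that the example runs and returns a small probability; they are not the fitted coefficients of this thesis, and the function and dictionary names are hypothetical.

```python
import math

# Placeholder coefficients (assumptions for illustration only); they are NOT
# the fitted values from the thesis.
COEFFS = {
    "intercept": -8.0,
    "vort": 5.0e4,      # per s^-1, 850 mb relative vorticity
    "vort_sq": -1.0e9,  # per s^-2, 850 mb relative vorticity squared
    "sst": 0.1,         # per deg C
    "shear": -0.05,     # per m/s
    "coriolis": 1.0e4,  # per s^-1
    "div": 5.0e4,       # per s^-1, 200 mb divergence
}

def formation_probability(vort850, sst, shear, coriolis, div200, coeffs=COEFFS):
    """Logistic-regression probability of TC formation for one day grid block."""
    z = (coeffs["intercept"]
         + coeffs["vort"] * vort850
         + coeffs["vort_sq"] * vort850 ** 2
         + coeffs["sst"] * sst
         + coeffs["shear"] * shear
         + coeffs["coriolis"] * coriolis
         + coeffs["div"] * div200)
    return 1.0 / (1.0 + math.exp(-z))

# Same environment except for SST; inputs are illustrative magnitudes.
p_warm = formation_probability(1.0e-5, 29.0, 10.0, 2.5e-5, 5.0e-6)
p_cool = formation_probability(1.0e-5, 26.0, 10.0, 2.5e-5, 5.0e-6)
```

Varying the training period or the predictor set, as described above, changes only the coefficient table and the list of terms inside `z`; the logistic form itself stays the same.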

D.  FINDINGS  FROM  CFS  CASE  STUDIES 

The previous sections have explored the validity of the statistical model in
identifying likely formation regions. The associated verification metrics represent
the potential skill, value, and applications as defined by the zero-lead hindcasts.
In this section we demonstrate the ability to use the CFS as the source of LSEF
values with which to force the regression model and generate forecast
probabilities (see Figure 18). The availability and format of CFS output fields
preclude the use of many of the quantitative verification metrics that were
possible with the reanalysis-based, zero-lead hindcasts. As a result, in order to
investigate the predictive potential of the proposed technique, we will present a
pair of case studies. The first case study is of a pair of storms from 2008 using
operational CFS data; the data used for Case 1 are exactly what is readily
available on a daily basis and could be used to operationalize the method
proposed  in  this  thesis.  The  second  case  study  is  one  from  2003  using  archived 
CFS  hindcast  fields.  Plots  from  some  additional  case  studies  are  included  in 
Appendix  C. 

1.  Non-Zero  Lead  Hindcasts:  Case  1 

TC  activity  in  the  2008  TC  season  in  the  WNP  was  relatively  low,  for 
reasons  that  are  not  yet  clear.  From  this  low  activity  season,  we  examined  two 
rather  low  intensity  TCs.  Our  model  should  be  robust  enough  to  predict  TCs  in 
low  activity  seasons  and  TCs  that  do  not  reach  high  intensities.  The  only  thing 
that  may  be  notable  about  these  two  TCs,  Mekkhala  (20W)  and  Higos  (21W),  is 
that  the  JTWC  has  traced  their  origins  back  to  the  same  day  in  2008. 

Disturbances that would develop into Mekkhala and Higos were identified
as early as 27 September (see Figure 32 for formation points). Mekkhala,
developing  in  the  South  China  Sea,  would  grow  to  tropical  depression  strength 
by  the  following  day,  and  be  a  named  tropical  storm  another  day  later,  on  29 
September.  Similarly,  Higos,  forming  in  the  WNP,  would  reach  tropical 
depression  strength,  and  then  tropical  storm  strength  on  29  September. 

Ens. Avg. 7-Day Probabilities: Mekkhala & Higos, 2-Week Lead

Figure 32. CFS ensemble mean probabilities from runs initiated on 13
September 2008, valid 24-30 September 2008. Formation points (solid
circles) and tracks (open circles) are included for Mekkhala (magenta) and
Higos (green).


Figure  32  is  a  plot  of  the  mean  seven-day  probabilities  from  the  four- 
member  ensemble.  From  this  plot  alone,  it  appears  the  CFS  predicted  the 
potential  for  above-average  TC  activity  in  the  greater  monsoon  trough  region  at  a 
two-week  lead  (tau:  336  hours).  The  difference  plot  in  Figure  33  confirms  that 
the  CFS-based  probabilities  were  higher  at  both  formation  points  than  what 
climatology  would  have  provided. 

Probability Difference: Mekkhala & Higos, 2-Week Lead


Figure  33.  A  probability  difference  plot  of  the  CFS  ensemble  mean 
probabilities  (as  in  Figure  32)  minus  the  climatological  formation 
probabilities  for  the  same  period.  Formation  points  are  included  for 
Mekkhala  (magenta)  and  Higos  (green). 

a) Member 1 Probabilities: Mekkhala & Higos, 2-Week Lead

d) Member 4 Probabilities: Mekkhala & Higos, 2-Week Lead


Figure  34.  Seven-day  probabilities  from  each  of  the  four  ensemble  members, 
initiated  on  13  September  2008,  valid  24-30  September  2008.  Formation 
points  are  included  for  Mekkhala  (magenta)  and  Higos  (green). 


Figure  34  separates  the  ensemble  mean  plot  in  Figure  32  into  individual 
ensemble  members.  Recall  that  the  members  are  identical  models,  but  have 
different  initial  conditions  and/or  initiation  times  (00Z  or  12Z).  These  minor 
variations  between  the  members  do  result,  as  shown  in  Figure  34,  in  pronounced 
differences after a two-week integration. A quick comparison of these seven-day
probabilities  reveals  that  no  one  member  performed  better  than  the  others  for 
both  of  these  storms,  although  member  four  appeared  to  strongly  predict  the 
development  of  Higos. 

The  contoured  probability  plots  like  those  in  Figure  34  represent  summed 
daily  probabilities.  In  addition  to  the  spatial  variability  of  the  individual  members, 
we  could  also  analyze  the  temporal  variations  between  the  members.  This 
additional  degree  of  variability  is  not  shown  in  this  report,  although  the  variations 
are  what  one  would  expect  when  comparing  runs  of  any  dynamical  model — 
timing  differences  exist  from  run-to-run.  The  spatial  and  temporal  variability 
between  members  highlights  what  was  first  mentioned  in  Section  II.B.4,  that  the 
ensemble approach smooths out differences between the runs, and highlights
the  more  predictable  elements  of  the  climate  system.  Thus,  this  ensemble 
approach  should  lead  to  enhanced  predictive  skill  overall,  although  there  will,  of 
course,  be  exceptions. 
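
The seven-day summed probabilities and the ensemble mean and spread discussed above can be illustrated as follows. This is a sketch under assumptions: the (member, day, latitude, longitude) layout, the function name `seven_day_ensemble`, the use of the standard deviation across members as the spread, and the toy values are not taken from the thesis.

```python
import numpy as np

def seven_day_ensemble(daily_probs, center):
    """Seven-day summed probabilities per member, plus ensemble mean and spread.

    daily_probs: array shaped (member, day, lat, lon) of daily probabilities.
    center: day index at the middle of the seven-day window.
    """
    if center < 3 or center + 3 >= daily_probs.shape[1]:
        raise ValueError("seven-day window falls outside the forecast period")
    weekly = daily_probs[:, center - 3:center + 4].sum(axis=1)
    ens_mean = weekly.mean(axis=0)
    ens_spread = weekly.std(axis=0)  # assumed definition of spread
    return weekly, ens_mean, ens_spread

# Toy four-member ensemble on a 1 x 1 grid; values are illustrative only.
member_levels = np.array([0.001, 0.002, 0.003, 0.004])
daily = np.ones((4, 7, 1, 1)) * member_levels[:, None, None, None]

weekly, ens_mean, ens_spread = seven_day_ensemble(daily, center=3)
```

Keeping the per-member `weekly` array alongside the mean makes it easy to produce both the ensemble mean plot and the individual member panels from one calculation.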

As  highlighted  earlier,  ensemble  member  four  appeared  to  capture  the 
formation  of  Higos.  We  explored  the  individual  LSEFs  that  contributed  to  the 
probabilities  plotted  in  panel  d  of  Figure  34.  Figure  35  displays  those  LSEFs  for 
the  day  of  formation,  27  September  2008. 

a) 9/27/08; 850mb Rel. Vor. (/s): Mekkhala & Higos, 2-Week Lead
b) 9/27/08; 850mb Vor.^2: Mekkhala & Higos, 2-Week Lead
c) 9/27/08; SST (deg. C): Mekkhala & Higos, 2-Week Lead
d) 9/27/08; Shear (m/s): Mekkhala & Higos, 2-Week Lead
e) 9/27/08; 200mb Div. (/s): Mekkhala & Higos, 2-Week Lead

Figure 35. Individual LSEFs from ensemble member four for the formation
day, 27 September 2008.

The panels in Figure 35 depict the LSEFs for the formation day in the
order  of  their  statistical  significance  in  the  regression  model:  a)  850  mb  relative 
vorticity,  b)  850  mb  relative  vorticity  squared,  c)  SST,  d)  vertical  wind  shear,  and 
e)  200  mb  divergence.  The  Coriolis  term  is  not  shown,  as  it  is  a  simple  function 
of  latitude,  and  thus  does  not  vary  by  member  or  run.  The  regression  model, 
when  applied  to  variables  from  member  four,  predicted  the  highest  probabilities 
of  formation  for  the  week  centered  on  the  formation  day  to  be  near  10°N  and 
140°E,  very  close  to  the  actual  formation  location.  Though  the  panels  in  Figure 
35  are  for  the  formation  day  alone,  they  reveal  why  the  high  probabilities  are 
predicted where they are. The region surrounding 10°N and 140°E is forecast
to have high low-level relative vorticity, very warm SSTs, a nearby low-shear
zone, and positive upper-level divergence.


a) 9/27/08; 200mb Wind (m/s): Mekkhala & Higos, 2-Week Lead
b) 9/27/08; 850mb Wind (m/s): Mekkhala & Higos, 2-Week Lead

Figure 36. Winds at a) 200 mb and b) 850 mb from a two-week lead of
ensemble  member  four  valid  for  the  formation  day,  27  September  2008. 

The vorticity, vorticity squared, shear, and divergence terms included in
the regression model are all calculated from the zonal and meridional winds
available from the CFS. Figure 36 depicts the 200 mb and 850 mb winds from
member four on formation day; the component winds used to create these
full-wind plots are the same used to calculate the variables in Figure 35, except SST.
Note  the  cyclonic  circulation  at  850  mb  and  anticyclonic  outflow  at  200  mb 
forecasted  by  member  four  for  the  formation  day  at  a  two-week  lead. 
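
As an illustration of how the wind-derived terms can be obtained from zonal (u) and meridional (v) components on a regular latitude-longitude grid, the sketch below uses centered finite differences in spherical coordinates. It is a minimal example under assumptions, not the thesis's code: the function names are hypothetical, and vertical wind shear is taken here as the magnitude of the 200 mb minus 850 mb vector wind difference, which may differ from the definition used in this thesis.

```python
import numpy as np

EARTH_RADIUS = 6.371e6  # m
OMEGA = 7.292e-5        # Earth's rotation rate, rad/s

def vorticity_divergence(u, v, lat_deg, lon_deg):
    """Relative vorticity and divergence (s^-1) of fields shaped (lat, lon).

    Not valid at the poles, where cos(latitude) is zero.
    """
    lat, lon = np.deg2rad(lat_deg), np.deg2rad(lon_deg)
    coslat = np.cos(lat)[:, None]
    dv_dlon = np.gradient(v, lon, axis=1)
    du_dlon = np.gradient(u, lon, axis=1)
    ducos_dlat = np.gradient(u * coslat, lat, axis=0)
    dvcos_dlat = np.gradient(v * coslat, lat, axis=0)
    vort = (dv_dlon - ducos_dlat) / (EARTH_RADIUS * coslat)
    div = (du_dlon + dvcos_dlat) / (EARTH_RADIUS * coslat)
    return vort, div

def wind_shear(u200, v200, u850, v850):
    """Magnitude of the 200 mb minus 850 mb vector wind (assumed definition)."""
    return np.hypot(u200 - u850, v200 - v850)

def coriolis(lat_deg):
    """Coriolis parameter f = 2 * Omega * sin(latitude)."""
    return 2.0 * OMEGA * np.sin(np.deg2rad(lat_deg))

# Check on a 2.5-degree grid with solid-body rotation, u = U cos(lat), v = 0,
# for which the relative vorticity is 2 U sin(lat) / a and divergence is zero.
lat = np.arange(0.0, 32.5, 2.5)
lon = np.arange(100.0, 182.5, 2.5)
U = 10.0
u = U * np.cos(np.deg2rad(lat))[:, None] * np.ones((lat.size, lon.size))
v = np.zeros_like(u)

vort, div = vorticity_divergence(u, v, lat, lon)
expected_vort = 2.0 * U * np.sin(np.deg2rad(lat)) / EARTH_RADIUS
```

The vorticity squared term then follows directly as `vort ** 2`, and the Coriolis term depends only on latitude, which is why it does not vary between ensemble members.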

In  this  particular  case  study,  a  well-trained  forecaster  might  have  been 
able  to  use  just  the  CFS  output  fields  (as  in  Figure  36)  to  foresee  the 
development  of  Higos  around  10°N  and  140°E.  Some  readers  may  then 
question,  why  would  one  not  just  use  the  available  CFS  dynamical  output  to 
forecast  tropical  cyclogenesis?  Many  of  the  potential  benefits  of  the  combined 
statistical-dynamical  approach  have  been  noted  implicitly  elsewhere  in  this 
thesis. We feel that, from the dynamical perspective, employing an ensemble
reduces the impacts of spatial and temporal errors within the model. Had we
analyzed member three, rather than member four, in the preceding figures, we
would have seen that both the timing and strength of the circulation were
inaccurate; a forecaster relying on that member would likely have misforecast the
formation of Higos. A reason why operational numerical weather prediction is
seldom  used  beyond  ten  days  to  two  weeks  is  that  longer  leads  are  often  beyond 
the  limit  of  predictability  of  individual  weather  elements.  Exploiting  the  expanded 
predictability  of  the  large-scale  circulations  and  ocean  memory  may  extend  the 
predictability  of  this  combined  method,  vice  the  predictability  of  individual 
elements.  Furthermore,  the  regression  model  represents  a  physically-  and 
statistically-sound  combination  of  LSEFs,  which  allows  one  to  produce  a  reliable, 
repeatable  prediction  of  TC  formation.  Rather  than  having  to  intuitively  compare 
multiple  output  fields  and  subjectively  generate  a  forecast,  the  contoured  plots 
from  the  proposed  method  are  easily  generated  and  interpreted  by  forecasters  or 
users.  For  these  reasons,  we  feel  that  this  combined  statistical-dynamical 
method  is  a  viable  approach  to  intraseasonal  prediction  of  tropical  cyclogenesis. 

Ens. Mean (contour) & Spread (fill): Mekkhala & Higos, 2-Week Lead

Figure 37. Comparison of CFS-based TC formation probabilities in the form of
an  ensemble  mean/spread  plot  (top)  and  OLR  (bottom)  for  the  same 
period.  OLR  image  provided  by  Physical  Sciences  Division,  Earth  System 
Research  Laboratory,  NOAA,  Boulder,  Colorado,  from  their  Web  site  at 
http://www.esrl.noaa.gov/psd/. 


As  noted  earlier,  in  addition  to  intraseasonal  prediction  of  tropical 
cyclogenesis,  this  method  appears  to  highlight  regions  of  likely  tropical  deep 
convection.  Figure  37  is  a  comparison  of  CFS-based  forecast  probabilities,  in 
the  form  of  an  ensemble  mean/spread  plot,  and  OLR  for  the  period  of  24-30 
September 2008. Whether verified against the formation of Mekkhala and Higos,
against the difference from climatology, or against deep convection, the CFS-based
probabilities  from  this  case  study  show  promise  for  this  combined  approach  at  a 
lead  time  of  two  weeks. 

2.  Non-Zero  Lead  Hindcasts:  Case  2


As  a  second  case  study  into  the  predictive  potential  based  on  CFS,  we 
focused  on  Ketsana  and  Parma  (20W  and  21W,  respectively),  two  storms  that 
formed  on  18  October  2003.  Rather  than  constructing  a  four-member  ensemble 
from  the  operational  CFS,  we  used  the  archived  ensemble  mean  from  the  CFS 
hindcast  project.  This  ensemble  mean  is  an  average  of  all  15  members 
initialized  in  one  month  from  the  CFS  hindcast  project.  As  a  result,  the  initial 
conditions  of  the  ensemble  mean  are  staggered  over  the  period  of  a  month.  Like 
other  CFS  runs,  the  integrations  extended  out  to  nine  months.  These  ensemble 
mean  runs  are  available  once  per  month  in  the  CFS  archive,  with  the  valid  times 
beginning  on  the  ninth  day  of  every  month.  Thus  we  were  able  to  work  with  a 
nine-day  lead  (tau:  144  hours)  and  a  39-day  lead  (tau:  864  hours)  in  this  case 
study. 
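
The lead times and tau values quoted above differ by three days (nine days is 216 hours, not 144). One reading that reproduces all four numbers, offered here as an assumption rather than as the thesis's stated convention, is that the lead is counted from the first valid day to the formation day, while tau is counted to the start of the seven-day window centered on the formation day. The function name below is hypothetical.

```python
from datetime import date, timedelta

def lead_and_tau(first_valid_day, formation_day, half_window_days=3):
    """Lead in days to the formation day and tau in hours to the window start.

    Assumes a seven-day window centered on the formation day, so the window
    begins half_window_days before formation.
    """
    lead_days = (formation_day - first_valid_day).days
    window_start = formation_day - timedelta(days=half_window_days)
    tau_hours = 24 * (window_start - first_valid_day).days
    return lead_days, tau_hours

formation = date(2003, 10, 18)
short_lead = lead_and_tau(date(2003, 10, 9), formation)  # run valid from 9 October
long_lead = lead_and_tau(date(2003, 9, 9), formation)    # run valid from 9 September
```

Under this reading the two runs give (9 days, 144 hours) and (39 days, 864 hours), matching the values in the text.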

Figure 38. Contoured, seven-day summed probabilities, centered about 18
October 2003, constructed from a) R2 and OISST and b) R1 and OISST
fields. The red dots indicate the formation points for Ketsana (right) and
Parma (left).

This second case study was chosen not for its perceived performance
based on the CFS, but rather for its unusual reverse-oriented monsoon trough
and high probabilities visible in the zero-lead hindcast (Figure 38). Figure 38
displays the high probabilities that extend SW to NE over the WNP when using
both the R1 and R2 reanalyses. The strong similarity between the R2-based
(top) and R1-based (bottom) plots suggests that our model is not overly sensitive
to the specific analysis and assimilation system. The logical question that follows
is  whether  the  15-member  CFS  ensemble  mean  would  predict  this  unusual 
activity. 


9-Day  Lead  Probabilities:  Ketsana  &  Parma 


Figure  39.  Contoured,  seven-day  probabilities,  centered  on  18  October  2003, 
constructed  from  the  archived  CFS  ensemble  mean  at  a  nine-day  lead. 
The  red  dots  indicate  the  formation  points  for  Ketsana  and  Parma. 


To  assess  the  predictive  potential,  we  first  investigated  the  nine-day  lead 
forecast.  Figure  39  depicts  the  probabilities  of  TC  formation  for  the  period  15-21 
October 2003, based on archived CFS ensemble mean fields with a nine-day lead
from  the  day  of  formation.  The  formation  points  for  both  Ketsana  and  Parma  are 
included  within  the  0.5%  minimum  contour. 

Figure 40 is the same as Figure 39, but from fields with a 39-day lead from
the day of formation. While the contours do suggest activity around 15°N, the
CFS-based probabilities at such a lead are notably different from the
reanalysis-based, zero-lead probabilities (Figure 38) and do not indicate reverse
monsoon trough conditions.


Figure  40.  Contoured,  seven-day  probabilities,  centered  on  18  October  2003, 
constructed  from  the  archived  CFS  ensemble  mean  at  a  39-day  lead.  The 
red  dots  indicate  the  formation  points  for  Ketsana  and  Parma. 

Visual  comparisons  between  Figure  38  and  Figures  39  and  40  indicate 
differences  between  the  CFS-based  probabilities  and  the  reanalysis-based 
probabilities in both magnitude and spatial distribution. As mentioned above, this
case was chosen, in part, because of the high probabilities found in the zero-lead
hindcast; both formation points were predicted with probabilities on the order of
0.1, or 10%. In contrast, the CFS-based probabilities at the
formation  points  range  from  approximately  0.004  to  0.013.  Also,  the  reanalysis- 
based  probabilities  depict  favorable  formation  in  a  reverse-oriented  monsoon 
trough  pattern,  while  the  CFS-based  plots  show  a  poleward  extension  of  the 
contoured  probabilities  from  the  climatologically-favored  monsoon  trough  region. 

One  is  left  to  wonder  what  accounts  for  the  difference  between  the  CFS- 
based  and  reanalysis-based  probabilities.  Is  it  a  weakness  of  the  regression 
model  and/or  of  the  CFS?  Is  something  unique  about  this  case  that  is  causing 
these  differences? 

a) 9-Day Lead KETSPARM Average 850mb Wind

Figure 41. Comparison of 850 mb winds for the period 15-21 October 2003,
from a) nine-day lead from the CFS ensemble mean and b) zero-lead R2
data. Note the different scales.

b) R2 KETSPARM Average 200mb Wind

Figure 42. Comparison of 200 mb winds for the period 15-21 October 2003,
from a) nine-day lead from the CFS ensemble mean and b) zero-lead R2
data.

Figures  41  and  42  are  comparisons  of  the  850  mb  winds  and  200  mb 
winds,  respectively,  from  averaged  CFS  ensemble  mean  output  (at  a  nine-day 
lead)  and  averaged  R2  data  (at  zero-lead)  for  the  same  period,  15-21  October 
2003. As discussed in case study 1, the magnitude and distribution of the
model  probabilities  are  sensitive  to  these  wind  fields.  From  Figures  41-42,  one 
can  start  to  hypothesize  why  the  probabilities  are  different  when  the  regression 
model  is  forced  with  CFS  and  with  R2  LSEF  values.  For  example,  the  850  mb 
winds  (Figure  41)  are  similar  in  direction  in  most  locations  except  the  region 
extending  from  125°E  to  150°E  and  straddling  10°N.  These  robust  westerlies 
indicated  by  the  R2  data,  at  zero  lead,  have  a  profound  impact  on  the  reanalysis- 
based  probabilities,  in  that  they  increase  the  vertical  wind  shear  in  that  region 
and amplify low-level relative vorticity to the north. As a result, the region 125°E
to 150°E and straddling 10°N is no longer favorable for TC formation, and
the probability of TC formation is enhanced to the immediate north of the
westerlies. These westerlies were not predicted by the CFS fields at a nine-day
lead;  therefore,  the  climatologically  favored  location  for  TC  genesis  is  not 
displaced.  The  differences  in  the  200  mb  winds  are  not  as  profound.  Overall,  it 
appears  that  temporally  summing  the  bias-corrected  ensemble  mean  fields  tends 
to  smooth  the  CFS  fields  such  that  they  represent  climatology.  In  the  absence  of 
any  other  predictable  elements,  seeing  the  CFS  tend  towards  climatology  is 
reassuring.  This  tendency  is  likely  due  in  part  to  the  bias  correction  we  applied  to 
the  CFS  output. 

9-Day Lead Probabilities: Ketsana & Parma

Probability Difference: Ketsana & Parma, 9-Day Lead


Figure  43.  Comparison  of  a)  CFS-based  probabilities  (repeat  of  Fig.  39),  b) 
probability  difference,  and  c)  OLR,  for  the  period  15-21  October  2003. 
The  red  dots  indicate  the  formation  points  for  Ketsana  and  Parma. 


From  this  case  study,  we  observe  that  the  15-member  CFS  hindcast 
ensemble  mean  may  be  too  much  like  climatology  to  yield  formation  probabilities 
that  deviate  greatly  from  climatology.  Despite  the  differences  between  the  R2 
and  CFS-forecasted  850  mb  winds,  the  probability  difference  plot  in  panel  b)  of 
Figure  43  highlights  that  the  model  still  predicts  probabilities  higher  than 
climatology  in  the  region.  In  addition,  a  visual  comparison  between  the  CFS- 
based  probabilities  and  the  OLR  plot  for  the  same  period,  Figure  43  panels  a) 
and  c),  suggests  that  this  period  may  have  been  a  convectively  active  period 
across  much  of  the  WNP,  and  that  the  CFS-based  probabilities  did  a  fair  job  in 
predicting  this  activity. 

3. General Observations


The  earlier  sections  on  the  verification  of  the  zero-lead  hindcasts 
established  a  skillful  benchmark  for  evaluating  non-zero  lead  hindcasts  and 
actual  forecasts.  The  two  non-zero  lead  hindcast  case  studies  presented  in  the 
preceding  section  indicate  that  our  combined  statistical-dynamical  method  for 
intraseasonal  prediction  of  regions  favorable  for  tropical  cyclogenesis  has  the 
potential  to  produce  useful  forecasts  from  the  existing  version  of  the  CFS. 

Some  of  the  differences  between  the  CFS-based  probabilities  and  the 
reanalysis-based  probabilities  are  likely  due  to  the  differing  mechanics  of  the  two 
systems.  Though  the  output  we  used  was  at  2.5°  horizontal  resolution  for  both 
systems,  the  effective  portrayal  of  the  assimilated  observational  data  is  different. 
The  R2  assimilates  data  from  a  multitude  of  observational  sources  directly  onto 
its  Gaussian  grid;  therefore,  it  is  conceivable  that  if  a  TC  were  forming  or  present 
over the WNP, the reanalysis data would represent the TC. While similar data are
incorporated into the CFS as initial conditions, as the model is integrated forward in
time,  the  coarse-resolution  numerics  and  physics  mean  that  the  smaller  scale 
features  in  the  LSEFs  associated  with  TCs  that  are  forming  or  present  will  in 
general  be  less  well  represented  than  in  the  R1  or  R2  fields  that  force  the  zero- 
lead  hindcasts.  Thus,  in  general,  the  CFS  is  likely  to  predict  LSEF  magnitudes 
and  gradients  that  are  weaker  than  those  in  R1  and  R2. 

One  should  recall  that  dynamical  models,  especially  GCMs,  though  based 
on physical laws, are unable to resolve all spatial and temporal scales and are
sensitive  to  their  often-problematic  parameterizations.  Nevertheless,  it  is 
important  to  remember  that  the  CFS  is  not  a  simplified  physics,  coarse  resolution 
atmospheric  model.  Indeed,  it  is  a  fully  coupled,  one-tier  dynamical  prediction 
system.  With  our  proposed  application,  the  coupling  in  the  CFS  is  rather 
important.  At  short  lead  times,  a  forecast  is  mostly  affected  by  atmospheric  initial 
conditions.  But  at  longer  lead  times,  the  ocean  plays  a  greater  role  and  can 
allow relatively high predictability in time-averaged forecasts.

We saw with the first case study that applying an ensemble approach to
the operational CFS may increase the predictability by smoothing out differences
between the members and enhancing the more predictable elements of the
climate system. The second case suggested that it might be possible to
oversmooth by using the archived ensemble mean summed over seven days. It was
promising,  however,  that  the  CFS  appears  to  trend  towards  a  plausible, 
climatological  state,  rather  than  toward  a  model  bias  state. 

The  first  case  study  indicates  that  it  may  be  possible  to  use  raw  output 
fields  from  the  CFS  to  predict  individual  TC  formations.  For  the  aforementioned 
reasons,  we  believe  that  until  the  single-element  predictability  is  increased  in 
dynamical  models,  using  the  raw  output  at  daily  resolutions  will  often  lead  one 
astray  at  intraseasonal  leads.  By  statistically  combining  several  variables  and 
summing  temporally,  the  predictability  is  likely  increased  and  more  reflective  of 
the  large-scale  environment  that  is  known  to  impact  TC  development. 
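
The ensemble-averaging and temporal-summing steps discussed above can be sketched as follows. This is only an illustration under stated assumptions: the array layout, the function name, and the `daily_probability` callable (standing in for the regression model) are hypothetical and are not taken from the code used in this research.

```python
import numpy as np


def seven_day_summed_probability(lsef_members, daily_probability):
    """Collapse an ensemble of LSEF fields into one summed probability map.

    lsef_members: array shaped (member, day, lat, lon, n_lsef) (assumed layout).
    daily_probability: hypothetical callable mapping an array shaped
        (day, lat, lon, n_lsef) to daily probabilities shaped (day, lat, lon).
    """
    lsef_members = np.asarray(lsef_members, dtype=float)
    # Averaging over members damps member-to-member differences.
    ensemble_mean = lsef_members.mean(axis=0)
    # Daily formation probabilities from the stand-in regression callable.
    daily = daily_probability(ensemble_mean)
    # Sum the daily maps over the forecast window (seven days here).
    return daily.sum(axis=0)


# Toy usage: four members, seven days, a 2 x 3 grid, one predictor.
toy_members = np.ones((4, 7, 2, 3, 1))
toy_map = seven_day_summed_probability(toy_members, lambda x: 0.01 * x[..., 0])
```

The sketch averages the LSEF fields before applying the regression. Applying the regression to each member and then averaging the probabilities is an alternative that would generally give different values, because the logistic function is non-linear.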

IV. SUMMARY, CONCLUSIONS, AND RECOMMENDATIONS


A.  KEY  RESULTS  AND  CONCLUSIONS 

This  thesis  is  an  exploration  into  the  viability  of  employing  a  combined 
statistical-dynamical  predictive  method  for  forecasting  TC  formation  probabilities 
at  intraseasonal  time  scales.  The  primary  focus  of  this  work  was  to  assess  the 
feasibility  of  using  such  a  method  to  predict  favorable  regions  for  tropical 
cyclogenesis.  We  also  investigated  whether  this  combined  statistical-dynamical 
approach  appears  to  result  in  skill  and  value  beyond  that  which  basic  climatology 
provides. 

Our  proposed  predictive  method  involves  forcing  a  statistical  model  with 
available  output  from  a  GCM.  We  began  by  investigating  various  atmospheric 
and  oceanic  variables  in  order  to  decide  upon  which  LSEFs,  or  genesis 
parameters,  to  include  as  explanatory  variables  in  our  model.  The  chosen 
statistical  model,  summarized  in  Table  1,  contains  terms  for  850  mb  relative 
vorticity,  850  mb  relative  vorticity  squared,  SST,  vertical  wind  shear,  Coriolis 
parameter,  and  200  mb  divergence.  Each  of  these  variables  was  found  to  be 
necessary,  both  statistically  and  conceptually,  but  together  may  not  be  sufficient 
to  forecast  actual  formation.  Multivariate  logistic  regression  was  used  to  develop 
a  statistical  model  for  the  probability  of  TC  formation  based  on  the  favorability  of 
the  large-scale  environment  as  defined  by  a  linear  combination  of  these  LSEFs. 
As  an  aside,  this  work  with  the  LSEFs  also  suggests  that  the  variable  thresholds, 
as  defined  by  studies  during  the  past  several  decades,  should  be  made  more 
restrictive.  For  example,  the  oft-cited  criterion  that  SST  in  the  WNP  must  be  > 
26.5°C  for  TC  formation  may  be  increased  to  >  28°C  (as  suggested  by  Figure  9). 

The  predictive  potential  of  our  method  was  first  assessed  by  thorough 
quantitative  and  qualitative  verification  of  reanalysis-based,  zero-lead  hindcasts. 
The  model  shows  great  potential,  with  a  BSS  of  0.0291  (0.0282... 0.0299),  a 
ROCSS  of  0.683,  reliable  summed  seven-day  probabilities,  and  potential  added 
value for risk-averse customers. In addition, the zero-lead hindcasts performed
well  in  dealing  with  climate  oscillations,  in  developing  conditional  climatologies,  in 
verification  against  deep  convection,  and  even  in  quantitative  verification  in  the 
Atlantic  basin. 
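
For readers unfamiliar with the two scores quoted above, the following sketch computes a Brier skill score relative to sample climatology and a ROC skill score using the common definition ROCSS = 2 x (area under the ROC curve) - 1. The function names are hypothetical, and the pairwise ROC calculation is only practical for small samples; it is not the verification code used in this research.

```python
import numpy as np


def brier_skill_score(prob, obs):
    """BSS of probability forecasts relative to the sample climatology of obs."""
    prob = np.asarray(prob, dtype=float)
    obs = np.asarray(obs, dtype=float)
    brier = np.mean((prob - obs) ** 2)
    # Reference forecast: the observed relative frequency, issued everywhere.
    brier_ref = np.mean((obs.mean() - obs) ** 2)
    return 1.0 - brier / brier_ref


def roc_skill_score(prob, obs):
    """ROCSS = 2 * AUC - 1, with AUC from all (event, non-event) pairs."""
    prob = np.asarray(prob, dtype=float)
    events = np.asarray(obs).astype(bool)
    event_probs = prob[events]
    nonevent_probs = prob[~events]
    # Count pairs in which the event received the higher probability;
    # tied pairs count as half.
    higher = (event_probs[:, None] > nonevent_probs[None, :]).sum()
    tied = (event_probs[:, None] == nonevent_probs[None, :]).sum()
    auc = (higher + 0.5 * tied) / (event_probs.size * nonevent_probs.size)
    return 2.0 * auc - 1.0
```

Under these definitions a perfect forecast scores 1 on both measures, and a constant forecast equal to the sample climatology scores 0 on both.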

The  second  assessment  of  the  predictive  potential  of  this  technique  came 
by way of two CFS case studies, where we generated non-zero lead
hindcasts for past TCs. The availability and format of CFS data limited much of
the verification of these studies to qualitative assessment. We explored an
ensemble  approach  as  a  way  to  smooth  out  the  spatial  and  temporal  variability 
between  members,  and  highlight  the  more  predictable  elements.  Both  the 
ensemble  approach  and  the  combination  of  LSEFs  together  lead  to  expanded 
predictability  of  the  large-scale  circulations,  vice  the  limited  predictability  of 
individual  elements.  Results  from  these  intraseasonal-lead  case  studies  are 
promising,  but  also  suggest  much  work  remains  when  it  comes  to  dynamical 
weather  prediction  on  the  intraseasonal  scale.  Purely  dynamical  intraseasonal 
forecasts  are  not  overly  skillful  (van  den  Dool  2007),  so  our  statistical-dynamical 
method  appears  to  be  a  useful  complement  to  existing  alternatives  for 
intraseasonal  forecasting  of  TC  formations. 

Overall,  our  method  provides  a  stable,  reliable,  and  repeatable  approach 
to  intraseasonal  TC  formation  prediction  that  is  applicable  throughout  the  year 
and,  apparently,  in  more  than  just  the  WNP  basin.  Our  method  allows 
forecasters  to  objectively  and  quantitatively  merge  information  about  all  the 
LSEFs to produce an ensemble-based, probabilistic forecast of the potential for
TC formation and the favorability of the climate system compared to long-term
mean climatological probabilities. A single contoured plot spanning a seven-day
period is easy to interpret and may even be presented directly to users. A typical
rule  of  thumb  in  forecasting  is  to  use  a  numerical  model  only  when  you  have 
confidence  in  its  output.  While  we  agree  with  that  mantra,  we  are  intrigued  by 
the  suggestion  that  the  bias-corrected  CFS  fields  tend  toward  climatology  when 
the  predictability  in  the  climate  system  is  low.  If  such  is  the  case,  this  method 
could be employed regularly and would, at the very least, depict a probabilistic
representation  of  TC  formation  climatology. 

The  concept  of  climatology  appears  throughout  this  thesis,  both  as  the 
reference  forecast  against  which  the  proposed  technique  was  judged  and  as  a 
potential  tool  in  itself.  Not  all  climatologies  are  created  equal,  however.  See 
Appendix  A  for  a  brief  discussion  on  the  variations  of  climatology  used  in  this 
work.  In  this  thesis,  the  choice  of  climatology  impacts  the  verification  results. 

Plots  of  the  difference  in  the  probabilities  generated  by  our  method  and 
those  from  climatology  provide  an  intriguing  presentation  of  the  skill  and  value  of 
our  method.  Such  plots  can  be  viewed  as  probability  anomalies  and  clearly 
reveal  where  our  method  predicts  higher  and  lower  likelihood  of  formation  than 
climatology.  Operationally,  a  forecast  for  no  (or  less-likely)  activity  may  be  just 
as  beneficial  as  a  forecast  for  highly-probable  formation.  For  example,  an 
extended  area  of  probabilities  lower  than  climatology  may  suggest  safer  passage 
for  a  carrier  strike  group  wishing  to  transit  the  region. 

Using  the  data  and  methods  outlined  in  Chapter  II,  we  believe  that  the 
model,  as  described  and  verified  in  Chapter  III,  presents  a  viable  approach  to 
intraseasonal  prediction  of  tropical  cyclogenesis.  The  numerous  preceding 
pages were presented not as a testament to the amount of code written or the number of
variations  tested  in  this  research,  but  rather  as  an  explanation  and  validation  of 
this  combined  statistical-dynamical  approach  in  intraseasonal  TC  prediction. 

B.  APPLICABILITY  TO  DOD  OPERATIONS 

O’Lenic  et  al.  (2007),  in  discussing  recent  developments  in  operational 
long-range  climate  prediction  at  CPC,  state  “improvements  in  the  science  and 
production  methods  of  LRFs  [long-range  forecasts]  are  increasingly  being  driven 
by  users,  who  are  finding  an  increasing  number  of  applications,  and  demanding 
improved  access  to  forecast  information.”  While  this  is  encouraging  and  may  be 
true  in  the  civilian  sector,  we  are  of  the  opinion  that  the  preponderance  of  DoD 
customers  do  not  know  of  what  Air  Force  Weather  (AFW)  and  Navy  METOC 
communities are truly capable. As the products and procedures of these two
communities  are  driven  by  requirements,  if  customers  do  not  require  a  product,  it 
will  likely  go  uninvestigated. 

The  majority  of  day-to-day  military  scheduling  and  planning  is  focused  on 
operations  and  exercises  that  will  occur  weeks  or  months  later.  Translating  the 
weeks  to  months  of  lead  times  of  the  planning  realm  into  meteorological  terms, 
we  draw  a  parallel  between  the  time  scale  of  military  planning  and  intraseasonal 
forecasts.  In  contrast,  the  preponderance  of  weather  support  provided  by  the 
AFW  and  Navy  METOC  communities  is  focused  on  short-range  forecasting  (lead 
times  of  72  hours  or  less)  or  nowcasting  (lead  times  less  than  three  hours).  This 
indicates  that  weather  support  is  out  of  synch  with  the  majority  of  the  planning 
done  by  its  military  customers. 

Arguably,  the  planning  phase  is  when  weather  support  may  have  the 
greatest  positive  impact  on  military  operations,  by  alerting  planners  to  the 
potential  conditions  that  may  impact  their  operations,  while  the  planners  still  have 
time  to  mitigate  the  impacts  of  some  environmental  conditions  and  exploit  the 
opportunities  provided  by  other  environmental  conditions.  For  planners  of  many 
military  operations,  short-range  forecasts  come  too  late  in  the  process  to  have 
much  influence  on  the  planning.  In  many  of  these  cases,  skillful  long-range 
forecasts (e.g., lead times of two weeks or longer) could be very useful in
determining  where  and  when  to  conduct  an  operation,  what  assets  and  tactics  to 
employ,  etc.  (personal  communication  CDR  Van  Gurley  2005;  CDR  Tony  Miller 
2009). 

Due  to  a  lack  of  freely  available  forecast  products  at  the  intraseasonal 
scale,  even  an  accessible,  understandable  depiction  of  climatology  or  of  a 
conditional  climatology  has  potential  value  for  military  planners.  The  DoD  lacks 
many such products. Previous theses (e.g., Tournay 2008; Moss 2007) and
sections  from  this  report  highlight  the  power  of  state-of-the-science  climatology, 
or  “smart”  climatology.  Creating  state-of-the-science  climatologies — using  the 
latest data sets, knowledge of climate oscillations, etc. — offers a significant
improvement  in  environmental  intelligence  for  DoD  planners. 

Active  intraseasonal  prediction  has  the  potential  to  add  value  beyond 
climatology.  By  exploiting  the  predictability  within  the  climate  system,  via 
statistical,  dynamical,  or  combined  methods,  skillful  weather  information  may  be 
provided  to  military  planners  and  operators.  It  is  important  for  military  centers  to 
undertake  such  prediction  in  addition  to  civilian  centers,  as  the  military  is  often 
focused  on  regions  and  variables  not  covered  by  civilian  products.  For  example, 
civilian  forecasting  centers  generally  focus  on  TC  landfall  locations  or  the  number 
of  TCs  in  a  season.  While  TC  landfall  and  seasonal  counts  are  important,  for  the 
military,  information  at  much  greater  temporal  and  spatial  resolution,  and  over  the 
open  ocean,  would  likely  prove  beneficial.  For  example,  Navy  and  Air  Force 
planners  would  benefit  from  insight  into  periods  and  regions  safe  for  ship  and 
aviation operations. The technique proposed in this thesis has other benefits as
well. For instance, an operational version of this process could be fully
automated, with its output delivered to forecasters and customers in multiple
formats, including those used by geographic information systems.

As  evidenced  by  the  demands  placed  on  civilian  forecast  centers  from 
customers,  one  is  led  to  conclude  that  if  DoD  planners  and  operators  saw  the 
potential  value-added  from  heeding  long-range  weather  intelligence,  they  too 
would  demand  more  of  it.  Products  stemming  from  intraseasonal  predictions 
need  not  be  starkly  different  from  short-term  forecasts  to  which  customers  are 
accustomed.  For  example,  the  ship  avoidance  chart  from  JTWC  (as  in  Figure 
44)  is  routinely  presented  to  operators  for  decision-making.  Potential 
deliverables  from  the  method  proposed  in  this  thesis  could  be  very  similar  to 
such ship avoidance charts. In fact, the similarity of products would aid in
fostering seamless weather support, from planning through mission execution, from the
users’ perspectives.

Figure 44. Example JTWC ship avoidance chart (From
http://metocph.nmci.navy.mil/jtwc/legend/ship_key.html; accessed 27 February 2009).


Whether  the  mission  is  a  trans-oceanic  air  bridge,  carrier  strike  group  flight 
qualification  training,  or  a  major  multi-national  naval  exercise,  no  current  DoD 
products  exist,  beyond  antiquated  climatology  products,  to  aid  mission  planners 
in assessing the likely state of the climate system weeks to months in advance. The
method  proposed  in  this  thesis,  and  others  like  it,  could  add  value  for  numerous 
customers,  and  certainly  has  the  potential  for  saving  units’  time  and  tax  dollars. 
This  thesis  represents  a  test  of  this  concept.  We  propose  that  this  and  similar 
products  be  presented  to  customers  to  see  what  applications  and  demands 
emerge  throughout  the  DoD. 

C. AREAS FOR FURTHER RESEARCH

The  previous  sections  have  shown  that  this  approach  demonstrates 
intriguing  potential,  and  that  ample  room  remains  for  further  research  and 
exploration. 

1.  Technique  Exploration 

As  this  was  a  proof  of  concept  for  the  technique,  further  exploration  into 
the  mechanics  of  the  approach  seems  prudent.  The  order  of  the  following  ideas 
for  future  research  does  not  represent  priority. 

1)  Vary  the  regression  model  based  on  end  strength  and/or  growth 
rate  of  the  included  storms.  Preliminary  work  confirms  the  common  thought  that 
not  all  TCs  form  and  behave  in  the  same  manner.  The  method  used  in  this 
thesis  was  founded  on  the  idea  that  compositing  numerous  storms  smooths  out 
the  differences  and  enhances  the  features  in  common.  However,  could  one 
construct  a  more  skillful  model  if  end  strength  and/or  growth  rate  were  taken  into 
consideration? 

2)  As  mentioned  in  Section  III.B.8,  some  of  the  apparent  shortcomings 
of  this  model  deal  with  the  post-formation  environment.  We  were  able  to  mitigate 
these  shortcomings  by  adjusting  the  NTCI,  filtering  out  data  according  to  MSLP 
from  the  model  construction  process,  and  including  the  relative  vorticity  squared 
term.  In  order  to  better  highlight  the  conditions  at  formation,  one  should 
uniformly  define  the  formation  day  in  the  best  track  archive  and  consider 
constructing a regression model excluding data surrounding the track
post-formation.

3)  Future  research  should  investigate  further  the  best  method  for 
including  NTCI  in  the  development  of  the  regression  model.  This  research 
should  attempt  to  answer  questions  such  as:  To  what  extent  should  NTCI  from 
regions  or  periods  in  which  TCs  have  never  formed  be  used  to  train  the  model? 
Should  all  NTCI  come  just  from  locations  and  months  in  which  TCs  have  been 
observed  to  occur? 

4) From the available reanalyses and CFS fields, we calculated
several  of  our  LSEFs  using  second  order  centered  finite  differencing  (see 
Appendix  B).  One  may  consider  using  a  more  advanced  method  to  calculate 
variables,  such  as  Legendre  polynomials  for  meridional  differentiation  and 
Fourier  analysis  for  zonal  spatial  differentiation. 

5)  As  we  observed  with  the  CFS  case  study,  a  delicate  balance 
appears  to  exist  between  predictability  and  resolution  (as  in  the  model’s 
difference  from  climatology).  The  construct  of  the  current  operational  CFS  allows 
one  to  readily  create  a  four-member  ensemble.  While  keeping  the  balance  issue 
in  mind,  one  may  explore  the  idea  of  creating  an  expanded  ensemble  by  using 
runs  initialized  on  multiple  days.  Such  an  approach  would  more  closely  resemble 
the  approach  CPC  takes  in  using  the  CFS  in  seasonal  forecasting. 

6)  A  struggle  throughout  this  thesis  process  concerned  the  issue  of 
how  best  to  verify  the  propensity  for  TC  formation.  Other  centers  with  similar 
spatial  forecasts  of  rare  events  seem  to  struggle  as  well,  and  no  industry 
standard  exists  for  the  verification  of  such  products.  The  approach  we  took  uses 
an  assemblage  of  tools,  most  of  which  inevitably  verify  the  propensity  for 
formation  against  actual  formations.  The  issue  of  verification  needs  to  be 
explored  further.  Could  we  numerically  score  against  OLR  or  some  other 
variable  that  represents  favorable  LSEFs? 

7)  While  numerous  combinations  of  possible  LSEFs  were  tested  for 
inclusion  into  the  regression  equation  in  this  research,  additional  work  could  be 
accomplished  in  this  area.  Ideal  candidates  are  oceanic  variables,  such  as 
mixed  layer  depth.  In  addition,  one  may  consider  additional  non-linear 
relationships  between  variables  and  TC  formation  or  between  separate  variables. 
For  example,  we  experienced  an  improvement  in  our  model’s  performance  by  the 
addition  of  the  vorticity-squared  term. 

8) Prior work by Meyer (2007) and others indicates that the same
LSEFs  that  influence  formation  may  also  influence  the  intensity  of  a  TC. 
Subsequent research may investigate the potential for generating near-term
estimates  for  the  intensity  of  a  storm  that  has  formed,  or  may  soon  form,  based 
on  the  predicted  conditions  of  the  large-scale  environment. 
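
As an illustration of the Fourier option raised in item 4 above, the sketch below differentiates a field along a complete latitude circle with the fast Fourier transform. It assumes a periodic, evenly spaced ring of points (for example, 144 points at 2.5° spacing), so it applies to global fields rather than a WNP-only subset. The function name and the Earth radius constant are assumptions introduced for this example.

```python
import numpy as np


def zonal_derivative_fft(field, lat_deg, radius_m=6.371e6):
    """d(field)/dx along a full latitude circle via Fourier differentiation.

    field: 1-D array of values at evenly spaced longitudes covering 360 degrees.
    lat_deg: latitude of the circle in degrees.
    """
    field = np.asarray(field, dtype=float)
    n = field.size
    # Integer zonal wavenumbers 0, 1, ..., -1 in FFT ordering.
    wavenumber = np.fft.fftfreq(n, d=1.0 / n)
    if n % 2 == 0:
        # The Nyquist mode has no well-defined derivative on an even grid.
        wavenumber[n // 2] = 0.0
    # Differentiation with respect to longitude is multiplication by i*k.
    dfield_dlon = np.fft.ifft(1j * wavenumber * np.fft.fft(field)).real
    # Convert from per-radian of longitude to per-meter of distance.
    return dfield_dlon / (radius_m * np.cos(np.deg2rad(lat_deg)))


# Usage: a wavenumber-3 sine wave sampled every 2.5 degrees at 10N.
lon_rad = np.deg2rad(np.arange(0.0, 360.0, 2.5))
wave = np.sin(3.0 * lon_rad)
dwave_dx = zonal_derivative_fft(wave, lat_deg=10.0)
```

For a smooth, band-limited field the spectral derivative is accurate to round-off, whereas a centered difference on the same grid carries a truncation error that grows with wavenumber.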

2.  Data  Exploitation 

As mentioned in Section II.B regarding the importance of the reanalysis
data  sets,  the  logistic  regression  approach  we  employed  would  not  have  been 
possible  if  the  atmospheric  and  SST  reanalysis  datasets  were  not  available. 
Similarly,  this  approach  would  not  have  been  viable  without  the  existence  of  the 
CFS  data  set,  including  an  extensive  hindcast  archive.  Current  and  forthcoming 
data  sources  offer  potential  avenues  through  which  to  improve  the  combined 
statistical-dynamical  method  proposed  in  this  thesis. 

1)  As  noted  throughout  this  thesis,  the  model  was  trained  on 
reanalysis  data  and  applied  in  proof-of-concept  testing  using  CFS  data.  Though 
it  would  require  a  substantial  storage  and  coding  investment  initially,  one  should 
consider  using  the  CFS  to  both  train  and  test  such  a  model.  In  addition  to 
accounting  for  the  subtle  biases  and  nuances  within  the  model,  this  approach 
would  allow  for  the  testing  of  more  variables — especially  oceanic  variables — 
thought  to  impact  TC  formation.  It  was  not  so  much  the  storage  or  coding  that 
pushed  us  away  from  this  approach  for  this  thesis,  but  rather  the  limited  days  for 
which hindcast data is available. Would enough storms be captured by a
purely-CFS approach to successfully train and test a regression model? In addition, we
felt  it  was  important  to  first  use  reanalysis  values  of  the  LSEFs  in  building  the 
regression  model,  so  that  a  relatively  skillful  benchmark  based  on  zero-lead 
hindcasting  could  be  established.  But  future  studies  could  consider  building  a 
regression  model  based  solely  on  forecasted  LSEFs. 

2)  Short  of  using  the  CFS  data,  one  may  consider  employing  an 
ocean  reanalysis,  or  the  forthcoming  coupled  reanalysis  from  NCEP,  to 
investigate  the  use  of  oceanic  LSEFs  other  than  SST.  We  hypothesize  that  a 
term  representing  mixed  layer  depth  may  be  a  more  skillful  predictor  than  SST. 
In addition, such oceanic variables are known to better represent long-term
climate system memory. Thus, using oceanic LSEFs that improve upon or
supplement SST could provide a better match between the model’s terms and the
climate  system  variables  that  best  represent  intraseasonal  predictability. 

3)  Low  skill  in  CFS  intraseasonal  predictions  of  atmospheric  variables 
is  likely  a  weak  point  for  our  technique.  While  there  is  no  reason  to  believe  that 
the  current  CFS  is  inferior  at  such  leads  compared  to  other  GCMs,  one  may  find 
it  worthwhile  to  explore  other  GCMs,  such  as  those  from  the  Goddard  Space 
Flight  Center  or  Australian  Bureau  of  Meteorology.  Though  more 
computationally  demanding,  the  most  intriguing  approach  may  be  to  employ  a 
multi-model  ensemble  approach  to  generate  the  necessary  LSEF  fields. 

4)  Future  plans  for  the  CFS  include  an  operational  T126  version. 
Though  we  feel  that  LSEFs  must  occur  over  an  adequate  spatial  and  temporal 
scale  to  affect  TC  formation,  a  higher  resolution  model  may  generate  higher 
magnitudes  and  gradients,  and  more  skillful  predictions  of  the  LSEFs. 
Experimental  runs  by  CPC  of  a  high-resolution  T254  and  T382  CFS  have  shown 
that  it  has  the  potential  to  predict  individual  TCs  and  may  have  skill  in 
characterizing  overall  TC  activity  (Schemm  et  al.  2008).  Undeniably,  a 
comparison  between  a  high-resolution  CFS,  or  comparable  system  (e.g.,  from 
ECMWF),  and  a  lower-resolution  combined  approach  as  proposed  in  this  thesis 
would  be  worthwhile. 

APPENDIX A. VARIATIONS OF CLIMATOLOGY


Climatology  is  used  throughout  this  thesis  as  a  baseline  against  which  we 
compare  our  statistical-dynamical  prediction  method.  Not  all  climatologies  are 
created  equal,  however.  The  following  paragraphs  highlight  the  forms  of 
climatology  applied  in  or  mentioned  in  this  work,  all  of  which  are  legitimate,  but 
distinct,  forms  of  climatology. 

The  most  basic  form  of  climatology  is  sample  climatology.  As  used  in  this 
thesis,  the  sample  climatology  is  the  average  rate  of  occurrence  based  on  the 
verification dataset. For example, if a TC hit is observed 10 times out of 1,000
possible  day  grid  points,  the  sample  climatology  would  be  10/1,000  or  0.01.  This 
form  of  climatology  is  used  in  quantitative  verification  such  as  the  BSS. 
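
To make the role of sample climatology in the BSS concrete, the worked example above can be carried one step further. The symbols below (observations o_i equal to 0 or 1, their mean, and the number of day grid points N) are notation introduced here for illustration. A constant forecast equal to the sample climatology has a Brier score that depends only on the base rate:

```latex
\[
\mathrm{BS}_{\mathrm{ref}}
  = \frac{1}{N}\sum_{i=1}^{N}\left(\bar{o} - o_i\right)^2
  = \bar{o}\,(1 - \bar{o})
  = 0.01 \times 0.99
  = 0.0099,
\qquad
\mathrm{BSS} = 1 - \frac{\mathrm{BS}}{\mathrm{BS}_{\mathrm{ref}}}.
\]
```

For rare events the reference score is therefore already small, which is one reason a small positive BSS can still indicate skill over climatology.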

We  also  use  various  forms  of  raw  climatology  based  on  the  JTWC  best 
track  data.  Figure  6  in  Chapter  II  is  an  example  plot  of  raw  climatology.  This 
plotted  data  was  created  by  treating  each  of  the  2.5°  x  2.5°  grid  blocks  in  the 
WNP  as  individual  bins.  Looping  through  a  set  period  of  time  (e.g.,  1970  to 
2007),  we  counted  the  number  of  formations  that  occur  in  each  bin,  then  divided 
the  number  in  each  bin  by  the  length  of  time  for  the  given  time  interval.  Based  on 
the  time  interval  one  chooses,  the  output  values  vary  numerically — as  daily, 
weekly,  monthly,  etc.  probabilities — but  the  spatial  distribution  does  not.  A 
shortcoming  of  this  raw  spatial  climatology  is  its  lack  of  day-to-day  variation,  in 
that  the  magnitude  and  distribution  of  daily  probabilities  for  27  March  are  the 
same  as  26  August,  which  we  know  is  not  typically  the  case  in  the  real  climate 
system. 
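
The binning procedure for the raw climatology can be sketched as below. The domain edges, the formation points, and the function name are made up for illustration; only the 2.5° bin size and the divide-by-period step follow the description above.

```python
import numpy as np


def raw_formation_climatology(form_lats, form_lons, n_days, lat_edges, lon_edges):
    """Formations per bin divided by the number of days in the record.

    Returns an array shaped (n_lat_bins, n_lon_bins) of daily probabilities.
    """
    counts, _, _ = np.histogram2d(form_lats, form_lons, bins=[lat_edges, lon_edges])
    return counts / n_days


# Hypothetical domain: 0-40N, 100-180E in 2.5-degree bins.
lat_edges = np.arange(0.0, 42.5, 2.5)
lon_edges = np.arange(100.0, 182.5, 2.5)

# Three made-up formation points over a made-up 100-day record.
clim = raw_formation_climatology(
    form_lats=[11.0, 12.0, 30.0],
    form_lons=[131.0, 132.0, 150.0],
    n_days=100,
    lat_edges=lat_edges,
    lon_edges=lon_edges,
)
```

Changing `n_days` to the number of weeks or months in the record rescales the values without altering the spatial pattern, which is the behavior noted in the text.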

A  more  robust  version  of  climatology,  still  based  on  the  JTWC  data,  is  one 
that  varies  in  magnitude  throughout  the  year.  This  form  of  climatology  was 
created  by  taking  a  28-day,  Loess-smoothed  form  of  the  daily  observed  TC 
formations  for  the  WNP,  dividing  by  the  number  of  days  in  the  period  to  give  us  a 
daily  probability  that  a  TC  will  form  somewhere  in  the  WNP  on  a  given  day. 
These daily probabilities were multiplied by a normalized spatial distribution of
the  likelihood  of  TC  formation  in  the  WNP.  The  result  is  a  climatology  that 
displays  an  annual  cycle  and  spatial  variation  in  the  output  probabilities.  This  is 
the  form  of  climatology  used  in  creating  the  difference  plots  depicted  in  Chapter 
III  of  this  work. 
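
The final multiplication step can be sketched as follows. The Loess smoothing itself is omitted: the smoothed basin-wide daily probabilities are taken as given input. The sketch also assumes that "normalized" means the spatial distribution sums to one over the basin, which is an interpretation rather than a detail stated above. The function name is hypothetical.

```python
import numpy as np


def daily_spatial_climatology(daily_basin_prob, spatial_counts):
    """Combine an annual cycle with a spatial pattern.

    daily_basin_prob: 1-D array, smoothed probability that a TC forms
        somewhere in the basin on each day.
    spatial_counts: 2-D array (lat, lon) of formation counts per bin.
    Returns an array shaped (day, lat, lon).
    """
    daily_basin_prob = np.asarray(daily_basin_prob, dtype=float)
    spatial_counts = np.asarray(spatial_counts, dtype=float)
    # Normalize so the spatial weights sum to one over the basin (assumption).
    spatial_weights = spatial_counts / spatial_counts.sum()
    # Broadcast the daily value across the spatial pattern.
    return daily_basin_prob[:, None, None] * spatial_weights[None, :, :]
```

With this normalization, summing any daily map over all grid points recovers the basin-wide probability for that day.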

Figure  45  is  an  example  of  the  components  involved  in  generating  such  a 
form  of  climatology:  a)  a  smoothed  version  of  daily  formation  counts,  b)  a 
normalized  distribution  of  spatial  climatology,  and  c)  an  example  of  the  resulting 
daily  probabilities  for  1  August.  This  form  of  climatology  vaguely  resembles  the 
approach  taken  by  Leroy  and  Wheeler  (2008),  who  generated  a  climatological 
seasonal  cycle  based  on  raw  probabilities  smoothed  through  harmonic  analysis. 

[Figure 45 panel a) title: "Loess (quadratic fit) 28-Point Smoothing of Formation Count"]


Figure  45.  Components  used  to  create  a  more  robust  climatology  against 
which  to  compare  our  method;  a)  a  smoothed  form  of  daily  formation 
counts,  b)  a  normalized  distribution  of  spatial  climatology,  and  c)  an 
example  of  the  resulting  daily  climatology  for  1  August. 


These  preceding  forms  of  climatology  are  all  based  on  the  observational 
JTWC  best  track  data.  An  approach  to  generating  a  pseudo-climatology  is 
mentioned  in  Section  III.B.5.  Rather  than  generating  probabilities  based  on  the 
number  of  TCs  observed  for  a  given  spatial  and  temporal  scale,  this  approach 
uses the regression model outlined in Table 1 to generate a probability of TC formation at
every grid point based on LTM LSEFs. A notable benefit of this approach is that
it is not sensitive to the number of TC formations. For example, if one wants to
create climatology for the probability of TC formation for a forthcoming exercise
in the month of May, a method relying on raw JTWC data would depict patchy
probabilities  due  to  the  limited  number  of  storms  (e.g.,  53  in  the  month  of  May  for 
the  years  1970  to  2006).  The  spottiness  of  the  output  would  not  accurately 
reflect  the  large-scale  environment,  but  rather  roughly  contour  the  individual 
storm  formation  points.  In  contrast,  the  LTM  LSEFs  (from  one  of  the  NCEP 
reanalyses)  when  processed  by  our  regression  model  would  result  in  a  depiction 
of  climatology  much  more  indicative  of  the  favorability  of  the  typical  climate 
system  in  the  month  of  May. 
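
The pseudo-climatology idea can be illustrated with a minimal MATLAB sketch. It
assumes only the standard logistic form p = 1/(1 + exp(-z)); the coefficient
values (b0, b), the variable names, the ordering of the LSEFs, and the tiny 2x2
grid are placeholders invented for illustration. The actual coefficients are
those of Table 1 and are not reproduced here.

```matlab
% Hypothetical sketch: pseudo-climatology from long-term mean (LTM) LSEFs.
% All coefficient values below are placeholders, NOT the values in Table 1.
b0 = -8.0;                          % placeholder intercept
b  = [0.5; 0.2; -0.1; 0.3; 0.4];    % placeholder weights, one per LSEF

% Placeholder LTM fields on a tiny 2x2 grid (real fields would be 41x144).
% Assumed order: vorticity, SST, vertical shear, Coriolis, divergence,
% each scaled the same way as in the regression fit.
ltm = cat(3, [1 2; 0 1], [28 29; 26 27], [10 5; 20 8], ...
             [2 2; 1 1], [1 3; 0 2]);

% Linear predictor z = b0 + sum_n b(n)*x_n, evaluated at every grid point.
z = b0 * ones(size(ltm, 1), size(ltm, 2));
for n = 1:numel(b)
    z = z + b(n) * ltm(:, :, n);
end

% Logistic link maps z to a probability between 0 and 1.
p_clim = 1 ./ (1 + exp(-z));
```

Because the inputs are smooth LTM fields rather than individual storm positions,
the resulting map varies smoothly regardless of how few storms formed in the
month of interest.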

As  noted,  each  of  the  preceding  forms  of  climatology  is  a  different,  but 
legitimate,  approach  to  representing  climatology.  Climatology  is  both  a  useful 
tool  and  a  baseline  reference  forecast.  In  a  situation  where  no  pronounced 
predictable  elements  appear  in  the  climate  system,  a  state-of-the-science 
climatology may be the best intraseasonal/seasonal outlook one has to offer.


APPENDIX B. CALCULATION OF VARIABLES


As  only  a  limited  number  of  variables  are  available  at  daily  timesteps  from 
the  CFS,  we  had  to  calculate  additional  variables  based  on  available  model 
output  fields.  In  this  work,  we  employed  second  order  centered  finite  differencing 
for  variables  requiring  spatial  derivatives. 

For  example,  a  variable  directly  representing  vertical  motion  is  not  readily 
available  from  the  CFS.  We  surmise  that  some  degree  of  uplift  would  exist 
(especially  in  and  around  the  monsoon  trough)  if  low-level  convergence  and/or 
upper-level divergence exist. As such, we opted to derive, among other
variables, 200 mb divergence from the available 200 mb zonal and meridional
wind fields.


Take  equation  2.21  from  Carlson  (1998),  where  horizontal  divergence  on  a 
fixed  pressure  level  is  given  by: 


$$\nabla_p \cdot \mathbf{V} = \left( \frac{\partial u}{\partial x} + \frac{\partial v}{\partial y} \right)_p$$

Holding the area constant, to represent the fixed model grid spacing, the
second order centered finite difference form of the horizontal divergence at
200 mb ($D_{200}$) becomes:

$$D_{200} = \left[ \frac{U_{j+1} - U_{j-1}}{2\Delta x} + \frac{V_{i+1} - V_{i-1}}{2\Delta y} \right]_{200}$$

where $D_{200}$ is the horizontal divergence at 200 mb, $U$ is the zonal wind,
$V$ is the meridional wind, $\Delta x$ is the zonal (east-west) grid spacing, and
$\Delta y$ is the meridional (north-south) grid spacing. Also, $j$ and $i$ are the
longitudinal and latitudinal indexes, respectively.

Converting the above equation into MATLAB syntax, the calculation of
200 mb divergence for an array of size (41,144,365) becomes:

    dy = 111319.49*2.5;            % Meridional spacing in meters, based on WGS-84
    dudx = zeros(size(U_200));     % Preallocate; boundary points remain zero
    dvdy = zeros(size(V_200));

    for i = 2:40
        % lat(i) is the latitude of row i in degrees. lat is assumed to be a
        % predefined 41-element vector; cosd needs the latitude, not the index.
        dx = cosd(lat(i))*111319.49*2.5;   % Zonal spacing in meters
        for j = 2:143
            for k = 1:365
                dudx(i,j,k) = (U_200(i,j+1,k) - U_200(i,j-1,k))/(2*dx);
                dvdy(i,j,k) = (V_200(i-1,j,k) - V_200(i+1,j,k))/(2*dy);
            end
        end
    end

    DIV_200 = dudx+dvdy;  % Divergence at 200 mb; s^-1

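Note that MATLAB indexes rows from top to bottom, so the latitudinal index
increases from north to south. The meridional difference is therefore taken as
row i-1 minus row i+1, the opposite of the convention in the equation above.
Also, U_200 and V_200 are predefined variables representing three-dimensional
arrays of the 200 mb zonal and meridional winds, respectively.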
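
The same second order centered differencing can also be written with array
slicing instead of nested loops. The sketch below is illustrative only: the
latitude vector `lat` (assumed here to run from 50°N to 50°S in 2.5° steps,
chosen simply to match a 41-row array), the short time axis, and the synthetic
wind fields are all assumptions made so the example can be checked by hand.

```matlab
% Sketch: vectorized second order centered differences for 200 mb divergence.
% Assumed (not from the source): lat runs 50N to 50S in 2.5 degree steps,
% rows are ordered north to south, and the winds below are synthetic test data.
deg2m = 111319.49;                 % meters per degree at the equator (WGS-84)
lat   = (50:-2.5:-50)';            % 41x1 latitudes in degrees, north to south
nlat = 41; nlon = 144; nt = 3;     % short time axis keeps the example small

% Synthetic winds: U grows by 1 m/s per column, V by 1 m/s per row southward.
[J, I] = meshgrid(1:nlon, 1:nlat);
U_200 = repmat(J, [1 1 nt]);
V_200 = repmat(I, [1 1 nt]);

dy = deg2m * 2.5;                  % meridional spacing in meters
dx = cosd(lat) * deg2m * 2.5;      % zonal spacing shrinks with latitude

DIV_200 = zeros(nlat, nlon, nt);   % boundary rows and columns stay zero
r = 2:nlat-1;  c = 2:nlon-1;       % interior points only

% Dividing by the column vector dx(r) relies on implicit expansion
% (MATLAB R2016b or later).
dudx = (U_200(r, c+1, :) - U_200(r, c-1, :)) ./ (2 * dx(r));
dvdy = (V_200(r-1, c, :) - V_200(r+1, c, :)) ./ (2 * dy);
DIV_200(r, c, :) = dudx + dvdy;    % divergence in s^-1
```

For these synthetic winds the interior divergence should equal 1/dx - 1/dy at
each row, which vanishes at the equator row where the two spacings coincide.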

With this spatial finite differencing, we could just as easily have used fourth
order finite differencing methods. However, because the model output variables
from which such additional variables are calculated are at 2.5° horizontal
resolution, we felt that fourth order methods would overly smooth the gradients.
Figure 46 is a comparison of second order versus fourth order finite
differencing for 200 mb divergence for a sample day in 1991.

[Figure 46, panel a) title: 200mb Divergence: 2nd Order, R2 1991_220]

Figure 46. Comparison of a) second order finite differencing and b) fourth
order finite differencing for 200 mb divergence on 8 August 1991,
constructed from R2 wind fields. Panel c) is the difference between a) and
b). Note the different scales between the divergence and difference plots.
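
To make the difference between the two stencils concrete, the following MATLAB
sketch applies both to a one-dimensional test function with a known derivative.
The sine profile and the spacing of 2.5 units are illustrative assumptions; the
fourth order formula shown is the standard five-point centered stencil, not
necessarily the exact form used to produce Figure 46.

```matlab
% Sketch: second vs fourth order centered differences on a 1-D test function.
% The sine profile and the 2.5-unit spacing are illustrative assumptions only.
h = 2.5;                           % grid spacing (same number as the 2.5 degree grid)
x = 0:h:360;                       % sample points
f = sind(x);                       % test function with a known derivative
dfdx_true = (pi/180) * cosd(x);    % exact derivative of sind(x), x in degrees

k = 3:numel(x)-2;                  % points with two neighbors on each side

% Second order: uses the immediate neighbors (total reach 2h).
d2 = (f(k+1) - f(k-1)) / (2*h);

% Fourth order: also uses the next neighbors out (total reach 4h).
d4 = (-f(k+2) + 8*f(k+1) - 8*f(k-1) + f(k-2)) / (12*h);

err2 = max(abs(d2 - dfdx_true(k)));
err4 = max(abs(d4 - dfdx_true(k)));
fprintf('max error: 2nd order %.2e, 4th order %.2e\n', err2, err4);
```

On this smooth test function the fourth order estimate is closer to the exact
derivative, but it draws on points two grid intervals away on each side, which
on a 2.5° grid is a 10° span. The sketch does not address how either stencil
behaves on coarse model output, which is the concern raised above.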


As noted in Section IV.C.1.3), the calculation of additional variables from
the  available  model  output  fields  is  an  area  open  to  further  research.  While  the 
second  order  centered  finite  differencing  allows  us  to  readily  calculate  several 
variables  that  are  based  on  spatial  derivatives,  the  five-degree  “reach”  about 
each  grid  point  does  result  in  some  gradient  loss  versus  what  we  might  get  if 
such  variables  were  directly  predicted  by  the  CFS. 



APPENDIX  C.  ADDITIONAL  CASE  STUDIES 


Operational  CFS  Cases 

As a further resource for the reader, this section includes plots for storms
that occurred in the WNP during the fall of 2008. The probability plots that
follow are based on the operational CFS, and thus are generated from the
four-member ensemble. The construct of these cases mirrors Case 1 in
Section III.D.1.

The genesis of Jangmi (19W) may be traced back to 24 September 2008.
Due to the limited availability of daily operational CFS fields, the lead time
for this case is limited to four days. Figure 47 depicts the seven-day summed
probabilities at a four-day lead and a comparison composite OLR plot for the
same seven-day period.
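
As a rough illustration of what "seven-day summed probabilities" from a
four-member ensemble could look like in code, the MATLAB sketch below averages
daily probability fields across members and then sums a seven-day window. The
array layout (latitude, longitude, day, member), the variable names, and the
placeholder values are assumptions; the actual procedure is the one described
for Case 1 in Section III.D.1.

```matlab
% Sketch: ensemble-mean, seven-day summed formation probability.
% Array layout and the placeholder values are assumptions for illustration.
nlat = 41; nlon = 144; ndays = 30; nmem = 4;

P = zeros(nlat, nlon, ndays, nmem);
for m = 1:nmem
    P(:, :, :, m) = 0.01 * m;      % placeholder daily probabilities per member
end

P_ens  = mean(P, 4);               % average across ensemble members
center = 15;                       % day index the window is centered on
win    = center-3 : center+3;      % seven consecutive days
P_7day = sum(P_ens(:, :, win), 3); % seven-day summed probability field
```

Note that a sum of daily probabilities is not itself bounded by one; it is used
here only because the plots are described as summed fields.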


[Figure 47, panel a) title: Ensemble Average 7-Day Probability: JANGMI 4-Day Lead]

Figure 47. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 4-day lead and b) OLR for the period of 21-27
September 2008. The formation point and storm track are marked by the
green dot and magenta circles, respectively.

Maysak (24W) was a weak storm whose origins may be traced back to 5
November 2008. Figure 48 depicts the seven-day summed probabilities at a
two-week lead, at a three-week lead, and a comparison composite OLR plot for
the same seven-day period.

Figure 48. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 2-week lead, b) at a 3-week lead, and c) OLR for the
period of 2-8 November 2008. The formation point and storm track are
marked by the green dot and magenta circles, respectively.


As a late season storm with unusual formation dynamics, Dolphin (27W)
makes an interesting case study. JTWC notes the beginnings of Dolphin as early
as 8 December 2008. Figure 49 displays the seven-day summed probabilities at
a two-week lead, at a three-week lead, and a comparison composite OLR plot for
the seven-day period about which the probability plots are centered.


Figure 49. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 2-week lead, b) at a 3-week lead, and c) OLR for the
period of 5-11 December 2008. The formation point and storm track are
marked by the green dot and magenta circles, respectively.


Hindcast  CFS  Cases 

In contrast to the above cases that were based on daily, operational CFS
output, the cases in this section are based on archived hindcast CFS data.
Though archived data is used, the lead times are still true-to-form; the
probability plots are thus contours of probabilities based on forecast variable
fields. The data used for generating these case studies mirrors the 15-member
ensemble mean data used for Case 2 in Section III.D.2.

Jelawat (13W) formed on 31 July 2000, in a location well removed from
the climatologically favored formation regions. Figure 50 provides a visual
comparison between the CFS-based probabilities and OLR over the same
seven-day period.

Figure 50. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 22-day lead and b) OLR for the period of 28 July - 3
August 2000. The formation point for Jelawat is highlighted by the
magenta dot.


JTWC lists the formation day for Krosa (24W) as 3 October 2001. Figure
51 offers a visual comparison between the seven-day summed CFS-based
probabilities centered on 3 October 2001, based on the 15-member ensemble
mean, and OLR over the same seven-day period.

Figure 51. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 24-day lead and b) OLR for the period of 30
September - 6 October 2001. The formation point for Krosa is marked by
the magenta dot.

As a final case study, Mindulle (10W) formed on 21 June 2004. The
panels in Figure 52 represent a) the CFS-based probabilities from a 12-day lead,
b) the CFS-based probabilities from a 43-day lead, and c) the NOAA interpolated
OLR image from the same period. The OLR images displayed in this appendix
and throughout this thesis are courtesy of the Physical Sciences Division, Earth
System Research Laboratory, NOAA, Boulder, Colorado, from their Web site at
http://www.esrl.noaa.gov/psd/.

[Figure 52, panel a) title: 12-Day Lead Probability: MINDULLE]

Figure 52. Comparison of a) CFS-based TC formation probabilities from the
ensemble mean at a 12-day lead, b) a 43-day lead, and c) OLR for the
period of 18-24 June 2004. The formation point for Mindulle is marked by
the magenta dot.